Preface


Thank you for choosing Toshiba semiconductor products. This is the year 2000 edition of the user’s
manual for the architecture of the TX79 RISC microprocessor core, a member of the TX System RISC
Family of Toshiba microprocessors.


This user’s manual is designed to be easily understood by engineers who are designing a Toshiba
microprocessor into their products for the first time. No special knowledge of this architecture is
assumed; the contents include basic information about the architecture of the TX79 microprocessor
core as well as more advanced, in-depth descriptions.


Toshiba is continually updating its technical publications. Any comments and suggestions regarding any
Toshiba document are most welcome and will be taken into account when subsequent editions are
prepared. To receive updates to the information in this manual, or for additional information about this
architecture, please contact your nearest Toshiba office or authorized Toshiba dealer.


April 2001


1. Using Toshiba Semiconductors Safely


TOSHIBA is continually working to improve the quality and the reliability of its products. 


Nevertheless, semiconductor devices in general can malfunction or fail due to their inherent 
electrical sensitivity and vulnerability to physical stress. It is the responsibility of the buyer, when 
utilizing TOSHIBA products, to observe standards of safety, and to avoid situations in which a 
malfunction or failure of a TOSHIBA product could cause loss of human life, bodily injury or 
damage to property. 


In developing your designs, please ensure that TOSHIBA products are used within specified
operating ranges as set forth in the most recent product specifications. Also, please keep in mind
the precautions and conditions set forth in the TOSHIBA Semiconductor Reliability Handbook.


2. Safety Precautions 


This section lists important precautions which users of semiconductor devices (and anyone else) 
should observe in order to avoid injury and damage to property, and to ensure safe and correct use 
of devices. 


Please be sure that you understand the meanings of the labels and the graphic symbol described 
below before you move on to the detailed descriptions of the precautions. 


[Explanation of labels] 


DANGER    Indicates an imminently hazardous situation which will result in death or
          serious injury if you do not follow instructions.


WARNING   Indicates a potentially hazardous situation which could result in death or
          serious injury if you do not follow instructions.


CAUTION   Indicates a potentially hazardous situation which, if not avoided, may result
          in minor or moderate injury.


[Explanation of graphic symbol] 


Indicates that caution is required (laser beam is dangerous to eyes). 




2.1 General Precautions regarding Semiconductor Devices 


CAUTION


Do not use devices under conditions exceeding their absolute maximum ratings (e.g. current, voltage, power dissipation or 
temperature). 
This may cause the device to break down, degrade its performance, or cause it to catch fire or explode resulting in injury. 


Do not insert devices in the wrong orientation. 

Make sure that the positive and negative terminals of power supplies are connected correctly. Otherwise the rated maximum 
current or power dissipation may be exceeded and the device may break down or undergo performance degradation, causing it to 
catch fire or explode and resulting in injury. 


When power to a device is on, do not touch the device’s heat sink. 
Heat sinks become hot, so you may burn your hand. 


Do not touch the tips of device leads. 
Because some types of device have leads with pointed tips, you may prick your finger. 


When conducting any kind of evaluation, inspection or testing, be sure to connect the testing equipment’s electrodes or probes to 
the pins of the device under test before powering it on. 
Otherwise, you may receive an electric shock causing injury. 


Before grounding an item of measuring equipment or a soldering iron, check that there is no electrical leakage from it. 
Electrical leakage may cause the device which you are testing or soldering to break down, or could give you an electric shock. 


Always wear protective glasses when cutting the leads of a device with clippers or a similar tool. 
If you do not, small bits of metal flying off the cut ends may damage your eyes.


2.2 Precautions Specific to Each Product Group


2.2.1 Optical semiconductor devices 


When a visible semiconductor laser is operating, do not look directly into the laser beam or look through the optical system. 
This is highly likely to impair vision, and in the worst case may cause blindness. 

If it is necessary to examine the laser apparatus, for example to inspect its optical characteristics, always wear the appropriate 
type of laser protective glasses as stipulated by IEC standard IEC825-1. 


WARNING


Ensure that the current flowing in an LED device does not exceed the device’s maximum rated current. 
This is particularly important for resin-packaged LED devices, as excessive current may cause the package resin to blow up, 
scattering resin fragments and causing injury. 


When testing the dielectric strength of a photocoupler, use testing equipment which can shut off the supply voltage to the
photocoupler. If you detect a leakage current of more than 100 µA, use the testing equipment to shut off the photocoupler’s
supply voltage; otherwise a large short-circuit current will flow continuously, and the device may break down or burst into flames,
resulting in fire or injury.


When incorporating a visible semiconductor laser into a design, use the device’s internal photodetector or a separate 
photodetector to stabilize the laser’s radiant power so as to ensure that laser beams exceeding the laser’s rated radiant power 
cannot be emitted. 

If this stabilizing mechanism does not work and the rated radiant power is exceeded, the device may break down or the 
excessively powerful laser beams may cause injury. 


2.2.2 Power devices 


Never touch a power device while it is powered on. Also, after turning off a power device, do not touch it until it has thoroughly 
discharged all remaining electrical charge. 

Touching a power device while it is powered on or still charged could cause a severe electric shock, resulting in death or serious 
injury. 


When conducting any kind of evaluation, inspection or testing, be sure to connect the testing equipment’s electrodes or probes to 
the device under test before powering it on. 

When you have finished, discharge any electrical charge remaining in the device. 

Connecting the electrodes or probes of testing equipment to a device while it is powered on may result in electric shock, causing 
injury.


Do not use devices under conditions which exceed their absolute maximum ratings (current, voltage, power dissipation,
temperature, etc.).

This may cause the device to break down, causing a large short-circuit current to flow, which may in turn cause it to catch fire or 
explode, resulting in fire or injury. 


Use a unit which can detect short-circuit currents and which will shut off the power supply if a short-circuit occurs. 
If the power supply is not shut off, a large short-circuit current will flow continuously, which may in turn cause the device to catch 
fire or explode, resulting in fire or injury. 


When designing a case for enclosing your system, consider how best to protect the user from shrapnel in the event of the device 
catching fire or exploding. 
Flying shrapnel can cause injury. 


When conducting any kind of evaluation, inspection or testing, always use protective safety tools such as a cover for the device. 
Otherwise you may sustain injury caused by the device catching fire or exploding. 


Make sure that all metal casings in your design are grounded to earth. 

Even in modules where a device’s electrodes and metal casing are insulated, capacitance in the module may cause the 
electrostatic potential in the casing to rise. 

Dielectric breakdown may cause a high voltage to be applied to the casing, causing electric shock and injury to anyone touching it. 


When designing the heat radiation and safety features of a system incorporating high-speed rectifiers, remember to take the 
device’s forward and reverse losses into account. 

The leakage current in these devices is greater than that in ordinary rectifiers; as a result, if a high-speed rectifier is used in an 
extreme environment (e.g. at high temperature or high voltage), its reverse loss may increase, causing thermal runaway to occur. 
This may in turn cause the device to explode and scatter shrapnel, resulting in injury to the user. 


A design should ensure that, except when the main circuit of the device is active, reverse bias is applied to the device gate while 
electricity is conducted to control circuits, so that the main circuit will become inactive. 
Malfunction of the device may cause serious accidents or injuries. 


CAUTION


When conducting any kind of evaluation, inspection or testing, either wear protective gloves or wait until the device has cooled 
properly before handling it. 

Devices become hot when they are operated. Even after the power has been turned off, the device will retain residual heat which 
may cause a burn to anyone touching it. 


2.2.3 Bipolar ICs (for use in automobiles) 


CAUTION


If your design includes an inductive load such as a motor coil, incorporate diodes or similar devices into the design to prevent 
negative current from flowing in. 

The load current generated by powering the device on and off may cause it to function erratically or to break down, which could in 
turn cause injury. 


Ensure that the power supply to any device which incorporates protective functions is stable. 
If the power supply is unstable, the device may operate erratically, preventing the protective functions from working correctly. If 
protective functions fail, the device may break down causing injury to the user.


3. General Safety Precautions and Usage Considerations


This section is designed to help you gain a better understanding of semiconductor devices, so as to 
ensure the safety, quality and reliability of the devices which you incorporate into your designs. 


3.1 From Incoming to Shipping


3.1.1 Electrostatic discharge (ESD)


When handling individual devices (which are not yet mounted on a printed circuit board), be sure
that the environment is protected against static electricity. Operators should wear anti-static
clothing, and containers and other objects which come into direct contact with devices should be
made of anti-static materials and should be grounded to earth via a 0.5- to 1.0-MΩ protective
resistor.


Please follow the precautions described below; this is particularly important
for devices which are marked “Be careful of static”.


(1) Work environment 


• When humidity in the working environment decreases, the human body and other insulators
can easily become charged with static electricity due to friction. Maintain the recommended
humidity of 40% to 60% in the work environment, while also taking into account the fact that
moisture-proof-packed products may absorb moisture after unpacking.


• Be sure that all equipment, jigs and tools in the working area are grounded to earth.


• Place a conductive mat over the floor of the work area, or take other appropriate measures, so
that the floor surface is protected against static electricity and is grounded to earth. The surface
resistivity should be 10⁴ to 10⁸ Ω/sq and the resistance between surface and ground, 7.5 × 10⁵ to
10⁸ Ω.


• Cover the workbench surface also with a conductive mat (with a surface resistivity of 10⁴ to
10⁸ Ω/sq, for a resistance between surface and ground of 7.5 × 10⁵ to 10⁸ Ω). The purpose of this
is to disperse static electricity on the surface (through resistive components) and ground it to
earth. Workbench surfaces must not be constructed of low-resistance metallic materials that
allow rapid static discharge when a charged device touches them directly.


• Pay attention to the following points when using automatic equipment in your workplace:


(a) When picking up ICs with a vacuum unit, use a conductive rubber fitting on the end of the
pick-up wand to protect against electrostatic charge.


(b) Minimize friction on IC package surfaces. If some rubbing is unavoidable due to the device's
mechanical structure, minimize the friction plane or use material with a small friction
coefficient and low electrical resistance. Also, consider the use of an ionizer.


(c) In sections which come into contact with device lead terminals, use a material which
dissipates static electricity.


(d) Ensure that no statically charged bodies (such as work clothes or the human body) touch
the devices.


(e) Make sure that sections of the tape carrier which come into contact with installation
devices or other electrical machinery are made of a low-resistance material.


(f) Make sure that jigs and tools used in the assembly process do not touch devices.


(g) In processes in which packages may retain an electrostatic charge, use an ionizer to
neutralize the ions.


• Make sure that CRT displays in the working area are protected against static charge, for
example by a VDT filter. As much as possible, avoid turning displays on and off. Doing so can
cause electrostatic induction in devices.


• Keep track of charged potential in the working area by taking periodic measurements.


• Ensure that work chairs are protected by an anti-static textile cover and are grounded to the
floor surface by a grounding chain. (Suggested resistance between the seat surface and
grounding chain is 7.5 × 10⁵ to 10¹² Ω.)


• Install anti-static mats on storage shelf surfaces. (Suggested surface resistivity is 10⁴ to 10⁸
Ω/sq; suggested resistance between surface and ground is 7.5 × 10⁵ to 10⁸ Ω.)


• For transport and temporary storage of devices, use containers (boxes, jigs or bags) that are
made of anti-static materials or materials which dissipate electrostatic charge.


• Make sure that cart surfaces which come into contact with device packaging are made of
materials which will conduct static electricity, and verify that they are grounded to the floor
surface via a grounding chain.


• In any location where the level of static electricity is to be closely controlled, the ground
resistance level should be Class 3 or above. Use different ground wires for all items of
equipment which may come into physical contact with devices.


(2) Operating environment 


• Operators must wear anti-static clothing and conductive shoes (or a leg or heel strap).


• Operators must wear a wrist strap grounded to earth via a resistor of about 1 MΩ.


• Soldering irons must be grounded from iron tip to earth, and must be used only at low voltages
(6 V to 24 V).


• If the tweezers you use are likely to touch the device terminals, use anti-static tweezers and in
particular avoid metallic tweezers. If a charged device touches a low-resistance tool, rapid
discharge can occur. When using vacuum tweezers, attach a conductive chucking pad to the tip,
and connect it to a dedicated ground used especially for anti-static purposes (suggested
resistance value: 10⁴ to 10⁸ Ω).


• Do not place devices or their containers near sources of strong electrical fields (such as above a
CRT).


• When storing printed circuit boards which have devices mounted on them, use a board
container or bag that is protected against static charge. To avoid the occurrence of static charge
or discharge due to friction, keep the boards separate from one another and do not stack them
directly on top of one another.


• Ensure, if possible, that any articles (such as clipboards) which are brought to any location
where the level of static electricity must be closely controlled are constructed of anti-static
materials.


• In cases where the human body comes into direct contact with a device, be sure to wear
anti-static finger covers or gloves (suggested resistance value: 10⁸ Ω or less).


• Equipment safety covers installed near devices should have resistance ratings of 10⁹ Ω or less.


• If a wrist strap cannot be used for some reason, and there is a possibility of imparting friction to
devices, use an ionizer.


• The transport film used in TCP products is manufactured from materials in which static
charges tend to build up. When using these products, install an ionizer to prevent the film from
being charged with static electricity. Also, ensure that no static electricity will be applied to the
product’s copper foils by taking measures to prevent static occurring in the peripheral
equipment.


3.1.2 Vibration, impact and stress 


Handle devices and packaging materials with care. To avoid damage 
to devices, do not toss or drop packages. Ensure that devices are not 
subjected to mechanical vibration or shock during transportation. 
Ceramic package devices and devices in canister-type packages which 
have empty space inside them are subject to damage from vibration 
and shock because the bonding wires are secured only at their ends. 


Plastic molded devices, on the other hand, have a relatively high level of resistance to vibration
and mechanical shock because their bonding wires are enveloped and fixed in resin. However, when
any device or package type is installed in target equipment, it is to some extent susceptible to
wiring disconnections and other damage from vibration, shock and stressed solder junctions.
Therefore when devices are incorporated into the design of equipment which will be subject to
vibration, the structural design of the equipment must be thought out carefully.


If a device is subjected to especially strong vibration, mechanical shock or stress, the package or 
the chip itself may crack. In products such as CCDs which incorporate window glass, this could 
cause surface flaws in the glass or cause the connection between the glass and the ceramic to 
separate. 


Furthermore, it is known that stress applied to a semiconductor device through the package 
changes the resistance characteristics of the chip because of piezoelectric effects. In analog circuit 
design attention must be paid to the problem of package stress as well as to the dangers of 
vibration and shock as described above.


3.2 Storage


3.2.1 General storage


• Avoid storage locations where devices will be exposed to moisture or direct sunlight.


• Follow the instructions printed on the device cartons regarding transportation and storage.


• The storage area temperature should be kept within a temperature range of 5°C to 35°C, and
relative humidity should be maintained at between 45% and 75%.


• Do not store devices in the presence of harmful (especially corrosive) gases, or in dusty
conditions.


• Use storage areas where there is minimal temperature fluctuation. Rapid temperature changes
can cause moisture to form on stored devices, resulting in lead oxidation or corrosion. As a result,
the solderability of the leads will be degraded.


• When repacking devices, use anti-static containers.


• Do not allow external forces or loads to be applied to devices while they are in storage.


• If devices have been stored for more than two years, their electrical characteristics should be
tested and their leads should be tested for ease of soldering before they are used.


3.2.2 Moisture-proof packing


Moisture-proof packing should be handled with care. The handling 
procedure specified for each packing type should be followed scrupulously. 
If the proper procedures are not followed, the quality and reliability of 
devices may be degraded. This section describes general precautions for 
handling moisture-proof packing. Since the details may differ from device 
to device, refer also to the relevant individual datasheets or databook. 


(1) General precautions 


• Follow the instructions printed on the device cartons regarding transportation and storage.


• Do not drop or toss device packing. The laminated aluminum material in it can be rendered
ineffective by rough handling.


• The storage area temperature should be kept within a temperature range of 5°C to 30°C, and
relative humidity should be maintained at 90% (max). Use devices within 12 months of the date
marked on the package seal.


• If the 12-month storage period has expired, or if the 30% humidity indicator shown in Figure 1
is pink when the packing is opened, it may be advisable, depending on the device and packing
type, to bake the devices at high temperature to remove any moisture. Please refer to the table
below. After the pack has been opened, use the devices in a 5°C to 30°C, 60% RH environment
and within the effective usage period listed on the moisture-proof package. If the effective usage
period has expired, or if the packing has been stored in a high-humidity environment, bake the
devices at high temperature.


Packing   Moisture removal

Tray      If the packing bears the “Heatproof” marking or indicates the maximum temperature which
          it can withstand, bake at 125°C for 20 hours. (Some devices require a different procedure.)

Tube      Transfer devices to trays bearing the “Heatproof” marking or indicating the temperature
          which they can withstand, or to aluminum tubes, before baking at 125°C for 20 hours.

Tape      Devices packed on tape cannot be baked and must be used within the effective usage
          period after unpacking, as specified on the packing.


• When baking devices, protect the devices from static electricity.


• Moisture indicators can detect the approximate humidity level at a standard temperature of
25°C. 6-point indicators and 3-point indicators are currently in use, but eventually all indicators
will be 3-point indicators.


[Figure: two humidity indicator cards, (a) 6-point indicator and (b) 3-point indicator, with spots
labeled from 10% to 60%. Card markings: "READ AT LAVENDER BETWEEN PINK & BLUE", "DANGER IF PINK",
"CHANGE DESICCANT".]


Figure 1 Humidity indicator


3.3 Design


Care must be exercised in the design of electronic equipment to achieve the desired reliability. It is
important not only to adhere to specifications concerning absolute maximum ratings and
recommended operating conditions, but also to consider the overall environment in
which equipment will be used, including factors such as the ambient temperature, transient noise
and voltage and current surges, as well as mounting conditions which affect device reliability. This
section describes some general precautions which you should observe when designing circuits and
when mounting devices on printed circuit boards.


For more detailed information about each product family, refer to the relevant individual technical 
datasheets available from Toshiba. 


3.3.1 Absolute maximum ratings 


CAUTION


Do not use devices under conditions in which their absolute maximum ratings (e.g. current, voltage,
power dissipation or temperature) will be exceeded. A device may break down or its performance may
be degraded, causing it to catch fire or explode resulting in injury to the user.


The absolute maximum ratings are rated values which must not be 
exceeded during operation, even for an instant. Although absolute 
maximum ratings differ from product to product, they essentially 
concern the voltage and current at each pin, the allowable power 
dissipation, and the junction and storage temperatures. 


If the voltage or current on any pin exceeds the absolute maximum 
rating, the device's internal circuitry can become degraded. In the worst 
case, heat generated in internal circuitry can fuse wiring or cause the semiconductor chip to break 
down. 


If storage or operating temperatures exceed rated values, the package seal can deteriorate or the 
wires can become disconnected due to the differences between the thermal expansion coefficients 
of the materials from which the device is constructed. 


3.3.2 Recommended operating conditions 


The recommended operating conditions for each device are those necessary to guarantee that the 
device will operate as specified in the datasheet. 

If greater reliability is required, derate the device's absolute maximum ratings for voltage, current, 
power and temperature before using it. 


3.3.3 Derating 


When incorporating a device into your design, reduce its rated absolute maximum voltage, current, 
power dissipation and operating temperature in order to ensure high reliability. 

Since derating differs from application to application, refer to the technical datasheets available 
for the various devices used in your design. 
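
As a rough illustration of the derating described above, the following Python sketch scales a set of
absolute maximum ratings by a derating factor and checks a planned operating point against the
derated limits. The rating values, the 0.8 factor and all names are hypothetical and chosen only for
illustration. Applying one factor to every quantity is a simplification; actual derating guidance
must come from the datasheet for each device.

```python
# Hypothetical absolute maximum ratings; real values come from the device datasheet.
ABS_MAX = {"voltage_v": 7.0, "current_ma": 50.0, "power_mw": 500.0, "temp_c": 125.0}


def derate(ratings, factor):
    """Return each rating multiplied by a derating factor in the range (0, 1]."""
    if not 0.0 < factor <= 1.0:
        raise ValueError("derating factor must be in (0, 1]")
    return {name: limit * factor for name, limit in ratings.items()}


def violations(operating_point, limits):
    """Return the sorted names of quantities that exceed their limits."""
    return sorted(name for name, value in operating_point.items() if value > limits[name])


# Assumed 80% derating, used here purely as an example.
limits = derate(ABS_MAX, 0.8)
point = {"voltage_v": 5.0, "current_ma": 45.0, "power_mw": 225.0, "temp_c": 85.0}
print(violations(point, limits))  # the 45 mA current exceeds the derated 40 mA limit
```

In this example only the current is flagged: 45 mA is within the hypothetical 50 mA absolute maximum
but above the derated limit.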


3.3.4 Unused pins 


If unused pins are left open, some devices can exhibit input instability problems, resulting in 
malfunctions such as abrupt increase in current flow. Similarly, if the unused output pins on a 
device are connected to the power supply pin, the ground pin or to other output pins, the IC may
malfunction or break down.


Since the details regarding the handling of unused pins differ from device to device and from pin
to pin, please follow the instructions given in the relevant individual datasheets or databook.


CMOS logic IC inputs, for example, have extremely high impedance. If an input pin is left open, it
can easily pick up extraneous noise and become unstable. In this case, if the input voltage level
reaches an intermediate level, it is possible that both the P-channel and N-channel transistors
will be turned on, allowing unwanted supply current to flow. Therefore, ensure that the unused
input pins of a device are connected to the power supply (Vcc) pin or ground (GND) pin of the same
device. For details of what to do with the pins of heat sinks, refer to the relevant technical
datasheet and databook.


3.3.5 Latch-up


Latch-up is an abnormal condition inherent in CMOS devices, in which Vcc gets shorted to ground.
This happens when a parasitic PN-PN junction (thyristor structure) internal to the CMOS chip is
turned on, causing a large current of the order of several hundred mA or more to flow between Vcc
and GND, eventually causing the device to break down.


Latch-up occurs when the input or output voltage exceeds the rated value, causing a large current
to flow in the internal chip, or when the voltage on the Vcc (Vdd) pin exceeds its rated value,
forcing the internal chip into a breakdown condition. Once the chip falls into the latch-up state,
even though the excess voltage may have been applied only for an instant, the large current
continues to flow between Vcc (Vdd) and GND (Vss). This causes the device to heat up and, in
extreme cases, to emit gas fumes as well. To avoid this problem, observe the following precautions:


(1) Do not allow voltage levels on the input and output pins either to rise above Vcc (Vdd) or to
fall below GND (Vss). Also, follow any prescribed power-on sequence, so that power is applied
gradually or in steps rather than abruptly.


(2) Do not allow any abnormal noise signals to be applied to the device.


(3) Set the voltage levels of unused input pins to Vcc (Vdd) or GND (Vss).


(4) Do not connect output pins to one another.


3.3.6 Input/Output protection


Wired-AND configurations, in which outputs are connected together, cannot be used, since this
short-circuits the outputs. Outputs should, of course, never be connected to Vcc (Vdd) or GND
(Vss).


Furthermore, ICs with tri-state outputs can undergo performance degradation if a shorted output
current is allowed to flow for an extended period of time. Therefore, when designing circuits, make
sure that tri-state outputs will not be enabled simultaneously.


3.3.7 Load capacitance


Some devices display increased delay times if the load capacitance is large. Also, large charging 
and discharging currents will flow in the device, causing noise. Furthermore, since outputs are 
shorted for a relatively long time, wiring can become fused. 


Consult the technical information for the device being used to determine the recommended load 
capacitance.


3.3.8 Thermal design


The failure rate of semiconductor devices is greatly increased as operating temperatures increase. 
As shown in Figure 2, the internal thermal stress on a device is the sum of the ambient 
temperature and the temperature rise due to power dissipation in the device. Therefore, to 
achieve optimum reliability, observe the following precautions concerning thermal design: 


(1) Keep the ambient temperature (Ta) as low as possible. 


(2) If the device’s dynamic power dissipation is relatively large, select the most appropriate 
circuit board material, and consider the use of heat sinks or of forced air cooling. Such 
measures will help lower the thermal resistance of the package. 


(3) Derate the device's absolute maximum ratings to minimize thermal stress from power 
dissipation. 

    θja = θjc + θca
    θja = (Tj - Ta) / P
    θjc = (Tj - Tc) / P
    θca = (Tc - Ta) / P

in which
    θja = thermal resistance between junction and surrounding air (°C/W)
    θjc = thermal resistance between junction and package surface, or internal thermal
          resistance (°C/W)
    θca = thermal resistance between package surface and surrounding air, or external
          thermal resistance (°C/W)
    Tj  = junction temperature or chip temperature (°C)
    Tc  = package surface temperature or case temperature (°C)
    Ta  = ambient temperature (°C)
    P   = power dissipation (W)


[Figure: thermal resistance diagram; the surviving labels are Ta, θca and Tc.]


Figure 2 Thermal resistance of package 
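
The thermal resistance relations in this section can be turned into a quick junction temperature
estimate. The sketch below is a minimal Python illustration; the function names and the numeric
values (thermal resistances, power and ambient temperature) are assumptions for the example, not
figures from any Toshiba datasheet.

```python
def theta_ja(theta_jc, theta_ca):
    """Junction-to-air thermal resistance (deg C/W): theta_ja = theta_jc + theta_ca."""
    return theta_jc + theta_ca


def junction_temperature(ta, power, theta_jc, theta_ca):
    """Rearranged from theta_ja = (Tj - Ta) / P, giving Tj = Ta + theta_ja * P."""
    return ta + theta_ja(theta_jc, theta_ca) * power


def case_temperature(ta, power, theta_ca):
    """Rearranged from theta_ca = (Tc - Ta) / P, giving Tc = Ta + theta_ca * P."""
    return ta + theta_ca * power


# Assumed example values: 10 deg C/W internal, 30 deg C/W external, 1.5 W, 40 deg C ambient.
tj = junction_temperature(ta=40.0, power=1.5, theta_jc=10.0, theta_ca=30.0)
tc = case_temperature(ta=40.0, power=1.5, theta_ca=30.0)
print(f"Tj = {tj:.1f} deg C, Tc = {tc:.1f} deg C")  # Tj = 100.0 deg C, Tc = 85.0 deg C
```

With these assumed numbers, halving the external thermal resistance (for example with a heat sink
or forced air cooling) would lower the estimated junction temperature from 100°C to 77.5°C at the
same power dissipation.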


3.3.9 Interfacing 


When connecting inputs and outputs between devices, make sure input voltage (VIL/VIH) and
output voltage (VOL/VOH) levels are matched. Otherwise, the devices may malfunction. When
connecting devices operating at different supply voltages, such as in a dual-power-supply system,
be aware that erroneous power-on and power-off sequences can result in device breakdown. For
details of how to interface particular devices, consult the relevant technical datasheets and
databooks. If you have any questions or doubts about interfacing, contact your nearest Toshiba
office or distributor. 
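
One way to check level matching is to compare the driver's guaranteed output levels with the
receiver's input thresholds. The Python sketch below assumes the usual datasheet convention of
worst-case values (VOH minimum, VOL maximum, VIH minimum, VIL maximum); the function names and the
voltage values are hypothetical and serve only as an illustration.

```python
def noise_margins(voh_min, vol_max, vih_min, vil_max):
    """Return (high_margin, low_margin) in volts; both must be positive for matched levels."""
    high_margin = voh_min - vih_min  # headroom when the driver outputs a high level
    low_margin = vil_max - vol_max   # headroom when the driver outputs a low level
    return high_margin, low_margin


def levels_match(voh_min, vol_max, vih_min, vil_max):
    """True when the driver's worst-case outputs satisfy the receiver's input thresholds."""
    high_margin, low_margin = noise_margins(voh_min, vol_max, vih_min, vil_max)
    return high_margin > 0 and low_margin > 0


# Hypothetical driver (VOH >= 2.4 V, VOL <= 0.4 V) into a receiver with VIH >= 2.0 V, VIL <= 0.8 V.
print(levels_match(voh_min=2.4, vol_max=0.4, vih_min=2.0, vil_max=0.8))  # True
# The same driver into a receiver that needs VIH >= 3.5 V fails the high-level check.
print(levels_match(voh_min=2.4, vol_max=0.4, vih_min=3.5, vil_max=1.5))  # False
```

A check of this kind covers static levels only; it says nothing about power-on and power-off
sequencing between supplies, which must be handled separately.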
3.3.10 Decoupling


Spike currents generated during switching can cause Vcc (Vdd) and GND (Vss) voltage levels to
fluctuate, causing ringing in the output waveform or a delay in response speed. (The power supply
and GND wiring impedance is normally 50 Ω to 100 Ω.) For this reason, the impedance of power
supply lines with respect to high frequencies must be kept low. This can be accomplished by using
thick and short wiring for the Vcc (Vdd) and GND (Vss) lines and by installing decoupling
capacitors (of approximately 0.01 µF to 1 µF capacitance) as high-frequency filters between Vcc
(Vdd) and GND (Vss) at strategic locations on the printed circuit board.


For low-frequency filtering, it is a good idea to install a 10- to 100-µF capacitor on the printed
circuit board (one capacitor will suffice). If the capacitance is excessively large, however (e.g.
several thousand µF), latch-up can be a problem. Be sure to choose an appropriate capacitance
value. 
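
To give a feel for the capacitance values mentioned above, the following Python sketch computes the impedance magnitude of an ideal capacitor, |Z| = 1 / (2πfC). It is only a rough guide: the function name and the 10 MHz example frequency are hypothetical, and the ideal model ignores the equivalent series resistance and inductance of real parts, which limit the usefulness of large capacitors at high frequencies.

```python
import math


def capacitor_impedance_ohms(capacitance_f, frequency_hz):
    """Magnitude of the impedance of an ideal capacitor: 1 / (2*pi*f*C)."""
    return 1.0 / (2.0 * math.pi * frequency_hz * capacitance_f)


# 10 MHz is an arbitrary example frequency, not a value from this manual.
for c_farads, label in [(0.01e-6, "0.01 uF"), (1e-6, "1 uF"), (100e-6, "100 uF")]:
    z = capacitor_impedance_ohms(c_farads, 10e6)
    print(f"{label}: {z:.4f} ohms at 10 MHz (ideal model)")
```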


An important point about wiring is that, in the case of high-speed logic ICs, noise is caused mainly
by reflection and crosstalk, or by the power supply impedance. Reflections cause increased signal
delay, ringing, overshoot and undershoot, thereby reducing the device's safety margins with
respect to noise. To prevent reflections, reduce the wiring length by increasing the device
mounting density so as to lower the inductance (L) and capacitance (C) in the wiring. Extreme
care must be taken, however, when taking this corrective measure, since it tends to cause
crosstalk between the wires. In practice, there must be a trade-off between these two factors.


3.3.11 External noise


Printed circuit boards with long I/O or signal pattern lines are vulnerable to induced noise or
surges from outside sources. Consequently, malfunctions or breakdowns can result from
overcurrent or overvoltage, depending on the types of device used. To protect against noise,
lower the impedance of the pattern line or insert a noise-canceling circuit. Protective
measures must also be taken against surges.


For details of the appropriate protective measures for a particular device, consult the relevant
databook.


3.3.12 Electromagnetic interference 


Widespread use of electrical and electronic equipment in recent years has brought with it radio 
and TV reception problems due to electromagnetic interference. To use the radio spectrum 
effectively and to maintain radio communications quality, each country has formulated 
regulations limiting the amount of electromagnetic interference which can be generated by 
individual products. 


Electromagnetic interference includes conduction noise propagated through power supply and 
telephone lines, and noise from direct electromagnetic waves radiated by equipment. Different 
measurement methods and corrective measures are used to assess and counteract each specific 
type of noise. 


Difficulties in controlling electromagnetic interference derive from the fact that there is no 
method available which allows designers to calculate, at the design stage, the strength of the 
electromagnetic waves which will emanate from each component in a piece of equipment. For this 
reason, it is only after the prototype equipment has been completed that the designer can take 
measurements using a dedicated instrument to determine the strength of electromagnetic 
interference waves. Yet it is possible during system design to incorporate some measures for the 
prevention of electromagnetic interference, which can facilitate taking corrective measures once 
the design has been completed. These include installing shields and noise filters, and increasing 
the thickness of the power supply wiring patterns on the printed circuit board. One effective
method, for example, is to devise several shielding options during design, and then select the most 
suitable shielding method based on the results of measurements taken after the prototype has 
been completed. 


3.3.13 Peripheral circuits 


In most cases semiconductor devices are used with peripheral circuits and components. The input 
and output signal voltages and currents in these circuits must be chosen to match the 
semiconductor device's specifications. The following factors must be taken into account. 


(1) Inappropriate voltages or currents applied to a device's input pins may cause it to operate 
erratically. Some devices contain pull-up or pull-down resistors. When designing your system, 
remember to take the effect of this on the voltage and current levels into account. 


(2) The output pins on a device have a predetermined external circuit drive capability. If this 
drive capability is greater than that required, either incorporate a compensating circuit into 
your design or carefully select suitable components for use in external circuits. 


3.3.14 Safety standards 


Each country has safety standards which must be observed. These safety standards include 
requirements for quality assurance systems and design of device insulation. Such requirements 
must be fully taken into account to ensure that your design conforms to the applicable safety 
standards. 


3.3.15 Other precautions 


(1) When designing a system, be sure to incorporate fail-safe and other appropriate measures 
according to the intended purpose of your system. Also, be sure to debug your system under 
actual board-mounted conditions. 


(2) If a plastic-package device is placed in a strong electric field, surface leakage may occur due to
the charge-up phenomenon, resulting in device malfunction. In such cases take appropriate
measures to prevent this problem, for example by protecting the package surface with a
conductive shield.


(3) With some microcomputers and MOS memory devices, caution is required when powering on 
or resetting the device. To ensure that your design does not violate device specifications, 
consult the relevant databook for each constituent device. 


(4) Ensure that no conductive material or object (such as a metal pin) can drop onto and short the 
leads of a device mounted on a printed circuit board. 


3.4 Inspection, Testing and Evaluation 
3.4.1 Grounding 
Ground all measuring instruments, jigs, tools and soldering irons to earth. 


CAUTION: Electrical leakage may cause a device to break down or may result in electric
shock.
3.4.2 Inspection Sequence


CAUTION:
• Do not insert devices in the wrong orientation. Make sure that the positive
and negative electrodes of the power supply are correctly connected.
Otherwise, the rated maximum current or maximum power dissipation
may be exceeded and the device may break down or undergo performance
degradation, causing it to catch fire or explode, resulting in injury to the
user.

• When conducting any kind of evaluation, inspection or testing using AC
power with a peak voltage of 42.4 V or DC power exceeding 60 V, be sure to
connect the electrodes or probes of the testing equipment to the device
under test before powering it on. Connecting the electrodes or probes of
testing equipment to a device while it is powered on may result in electric
shock, causing injury.


(1) Apply voltage to the test jig only after inserting the device securely into it. When applying or 
removing power, observe the relevant precautions, if any. 


(2) Make sure that the voltage applied to the device is off before removing the device from the 
test jig. Otherwise, the device may undergo performance degradation or be destroyed. 


(3) Make sure that no surge voltages from the measuring equipment are applied to the device. 


(4) The chips housed in tape carrier packages (TCPs) are bare chips and are therefore exposed. 
During inspection take care not to crack the chip or cause any flaws in it. 
Electrical contact may also cause a chip to become faulty. Therefore make sure that nothing 
comes into electrical contact with the chip. 


3.5 Mounting


There are essentially two main types of semiconductor device package: lead insertion and surface 
mount. During mounting on printed circuit boards, devices can become contaminated by flux or 
damaged by thermal stress from the soldering process. With surface-mount devices in particular, 
the most significant problem is thermal stress from solder reflow, when the entire package is 
subjected to heat. This section describes a recommended temperature profile for each mounting 
method, as well as general precautions which you should take when mounting devices on printed 
circuit boards. Note, however, that even for devices with the same package type, the appropriate 
mounting method varies according to the size of the chip and the size and shape of the lead frame. 
Therefore, please consult the relevant technical datasheet and databook. 


3.5.1 Lead forming


CAUTION:
• Always wear protective glasses when cutting the leads of a device with
clippers or a similar tool. If you do not, small bits of metal flying off the cut
ends may damage your eyes.
• Do not touch the tips of device leads. Because some types of device have
leads with pointed tips, you may prick your finger.


Semiconductor devices must undergo a process in which the leads are cut and formed before the 
devices can be mounted on a printed circuit board. If undue stress is applied to the interior of a 
device during this process, mechanical breakdown or performance degradation can result. This is 
attributable primarily to differences between the stress on the device's external leads and the 
stress on the internal leads. If the relative difference is great enough, the device's internal leads, 
adhesive properties or sealant can be damaged. Observe these precautions during the lead- 
forming process (this does not apply to surface-mount devices): 


(1) Lead insertion hole intervals on the printed circuit board should match the lead pitch of the 
device precisely. 


(2) If lead insertion hole intervals on the printed circuit board do not precisely match the lead 
pitch of the device, do not attempt to forcibly insert devices by pressing on them or by pulling 
on their leads. 


(3) For the minimum clearance specification between a device and a 
printed circuit board, refer to the relevant device’s datasheet and 
databook. If necessary, achieve the required clearance by forming 
the device's leads appropriately. Do not use the spacers which are 
used to raise devices above the surface of the printed circuit board 
during soldering to achieve clearance. These spacers normally 
continue to expand due to heat, even after the solder has begun to solidify; this applies severe 
stress to the device. 


(4) Observe the following precautions when forming the leads of a device prior to mounting. 


• Use a tool or jig to secure the lead at its base (where the lead meets the device package) while
bending so as to avoid mechanical stress to the device. Also avoid bending or stretching device
leads repeatedly.


• Be careful not to damage the lead during lead forming.


• Follow any other precautions described in the individual datasheets and databooks for each
device and package type.


3.5.2 Socket mounting


(1) When socket mounting devices on a printed circuit board, use sockets which match the 
inserted device’s package. 


(2) Use sockets whose contacts have the appropriate contact pressure. If the contact pressure is 
insufficient, the socket may not make a perfect contact when the device is repeatedly inserted 
and removed; if the pressure is excessively high, the device leads may be bent or damaged 
when they are inserted into or removed from the socket. 


(3) When soldering sockets to the printed circuit board, use sockets whose construction prevents 
flux from penetrating into the contacts or which allows flux to be completely cleaned off. 


(4) Make sure the coating agent applied to the printed circuit board for moisture-proofing 
purposes does not stick to the socket contacts. 


(5) If the device leads are severely bent by a socket as it is inserted or removed and you wish to 
repair the leads so as to continue using the device, make sure that this lead correction is only 
performed once. Do not use devices whose leads have been corrected more than once. 


(6) If the printed circuit board with the devices mounted on it will be subjected to vibration from 
external sources, use sockets which have a strong contact pressure so as to prevent the 
sockets and devices from vibrating relative to one another. 


3.5.3 Soldering temperature profile


The soldering temperature and heating time vary from device to device. Therefore, when 
specifying the mounting conditions, refer to the individual datasheets and databooks for the 
devices used. 
(1) Using a soldering iron


Complete soldering within ten seconds for lead temperatures of up to 260°C, or within three 
seconds for lead temperatures of up to 350°C. 


(2) Using medium infrared ray reflow 


• Heating top and bottom with long or medium infrared rays is recommended (see Figure 3).


[Figure: medium infrared ray heater (reflow) and long infrared ray heater (preheating), arranged
along the product flow]
Figure 3 Heating top and bottom with long or medium infrared rays


• Complete the infrared ray reflow process within 30 seconds at a package surface temperature of
between 210°C and 240°C.


• Refer to Figure 4 for an example of a good temperature profile for infrared or hot air reflow.


[Figure: package surface temperature (°C) against time (in seconds); marked levels 210 and 240;
marked intervals 60-120 seconds and 30 seconds or less]
Figure 4 Sample temperature profile for infrared or hot air reflow
(3) Using hot air reflow 


• Complete hot air reflow within 30 seconds at a package surface temperature of between 210°C
and 240°C.


• For an example of a recommended temperature profile, refer to Figure 4 above.
(4) Using solder flow
• Apply preheating for 60 to 120 seconds at a temperature of 150°C.


• For lead insertion-type packages, complete solder flow within 10 seconds, with the
temperature at the stopper (or, if there is no stopper, at a location more than 1.5 mm from
the body) not exceeding 260°C.
• For surface-mount packages, complete soldering within 5 seconds at a temperature of 250°C or
less in order to prevent thermal stress in the device.


• Figure 5 shows an example of a recommended temperature profile for surface-mount packages
using solder flow.


[Figure: package surface temperature (°C) against time (in seconds); marked level 250;
marked intervals 60-120 seconds and 5 seconds or less]
Figure 5 Sample temperature profile for solder flow
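
The reflow limits given for infrared and hot air reflow (no more than 30 seconds at a package surface temperature between 210°C and 240°C) can be checked against a measured profile. The following Python sketch is one possible interpretation of that rule, not a Toshiba procedure: the function name, the sample profile and the way time above 210°C is accumulated are all assumptions made for illustration.

```python
def check_reflow_profile(samples, low_c=210.0, high_c=240.0, max_seconds=30.0):
    """Check a measured profile given as a non-empty list of (time_s, temp_c),
    sorted by time.

    Returns (ok, seconds_at_or_above_low, peak_c).
    """
    peak = max(temp for _, temp in samples)
    hot_time = 0.0
    for (t0, temp0), (t1, temp1) in zip(samples, samples[1:]):
        # Coarse approximation: count an interval only when both of its
        # endpoints are at or above low_c.
        if temp0 >= low_c and temp1 >= low_c:
            hot_time += t1 - t0
    ok = peak <= high_c and hot_time <= max_seconds
    return ok, hot_time, peak


# Hypothetical measured profile: (time in seconds, temperature in degrees C).
profile = [(0.0, 25.0), (90.0, 150.0), (120.0, 210.0),
           (135.0, 235.0), (145.0, 210.0), (160.0, 150.0)]
ok, hot_seconds, peak_c = check_reflow_profile(profile)
print(ok, hot_seconds, peak_c)
```

The limits are passed as parameters so that the same check can be reused with the values from an individual device's datasheet, which take precedence over the general figures here.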


3.5.4 Flux cleaning and ultrasonic cleaning


When cleaning circuit boards to remove flux, make sure that no residual reactive ions such as 
Na or Cl remain. Note that organic solvents react with water to generate hydrogen chloride 
and other corrosive gases which can degrade device performance. 


Washing devices with water will not cause any problems. However, make sure that no 
reactive ions such as sodium and chlorine are left as a residue. Also, be sure to dry devices 
sufficiently after washing. 


Do not rub device markings with a brush or with your hand during cleaning or while the 
devices are still wet from the cleaning agent. Doing so can rub off the markings. 


The dip cleaning, shower cleaning and steam cleaning processes all involve the chemical 
action of a solvent. Use only recommended solvents for these cleaning methods. When 
immersing devices in a solvent or steam bath, make sure that the temperature of the liquid is 
50°C or below, and that the circuit board is removed from the bath within one minute. 


Ultrasonic cleaning should not be used with hermetically-sealed ceramic packages such as a 
leadless chip carrier (LCC), pin grid array (PGA) or charge-coupled device (CCD), because the 
bonding wires can become disconnected due to resonance during the cleaning process. Even if 
a device package allows ultrasonic cleaning, limit the duration of ultrasonic cleaning to as 
short a time as possible, since long hours of ultrasonic cleaning degrade the adhesion between 
the mold resin and the frame material. The following ultrasonic cleaning conditions are 
recommended: 


Frequency: 27 kHz to 29 kHz
Ultrasonic output power: 300 W or less (0.25 W/cm² or less)
Cleaning time: 30 seconds or less


Suspend the circuit board in the solvent bath during ultrasonic cleaning in such a way that 
the ultrasonic vibrator does not come into direct contact with the circuit board or the device. 
3.5.5 No cleaning


If analog devices or high-speed devices are used without being cleaned, flux residues may cause 
minute amounts of leakage between pins. Similarly, dew condensation, which occurs in 
environments containing residual chlorine when power to the device is on, may cause between- 
lead leakage or migration. Therefore, Toshiba recommends that these devices be cleaned. 
However, if the flux used contains only a small amount of halogen (0.05 wt% or less), the devices
may be used without cleaning without any problems.


3.5.6 Mounting tape carrier packages (TCPs)


When tape carrier packages (TCPs) are mounted, measures must be taken to prevent 
electrostatic breakdown of the devices. 


If devices are being picked up from tape, or outer lead bonding (OLB) mounting is being 
carried out, consult the manufacturer of the insertion machine which is being used, in order 
to establish the optimum mounting conditions in advance and to avoid any possible hazards. 


The base film, which is made of polyimide, is hard and thin. Be careful not to cut or scratch 
your hands or any objects while handling the tape. 


When punching tape, try not to scatter broken pieces of tape too much. 


Treat the extra film, reels and spacers left after punching as industrial waste, taking care not 
to destroy or pollute the environment. 


Chips housed in tape carrier packages (TCPs) are bare chips and therefore have their reverse 
side exposed. To ensure that the chip will not be cracked during mounting, ensure that no 
mechanical shock is applied to the reverse side of the chip. Electrical contact may also cause a 
chip to fail. Therefore, when mounting devices, make sure that nothing comes into electrical 
contact with the reverse side of the chip. 

If your design requires connecting the reverse side of the chip to the circuit board, please 
consult Toshiba or a Toshiba distributor beforehand. 


3.5.7 Mounting chips


Devices delivered in chip form tend to degrade or break under external forces much more easily 
than plastic-packaged devices. Therefore, caution is required when handling this type of device. 


(1) Mount devices in a properly prepared environment so that chip surfaces will not be exposed to
polluted ambient air or other polluted substances.


(2) When handling chips, be careful not to expose them to static electricity.

In particular, measures must be taken to prevent static damage during the mounting of chips.
With this in mind, Toshiba recommends mounting all peripheral parts first and then mounting
chips last (after all other components have been mounted).


(3) Make sure that PCBs (or any other kind of circuit board) on which chips are being mounted do
not have any chemical residues on them (such as the chemicals which were used for etching
the PCBs).


(4) When mounting chips on a board, use the method of assembly that is most suitable for
maintaining the appropriate electrical, thermal and mechanical properties of the
semiconductor devices used.


* For details of devices in chip form, refer to the relevant device's individual datasheets. 
3.5.8 Circuit board coating


When devices are to be used in equipment requiring a high degree of reliability or in extreme 
environments (where moisture, corrosive gas or dust is present), circuit boards may be coated for 
protection. However, before doing so, you must carefully consider the possible stress and 
contamination effects that may result and then choose the coating resin which results in the 
minimum level of stress to the device. 


3.5.9 Heat sinks 


(1) When attaching a heat sink to a device, be careful not to apply excessive force to the device in 
the process. 


(2) When attaching a device to a heat sink by fixing it at two or more locations, evenly tighten all 
the screws in stages (i.e. do not fully tighten one screw while the rest are still only loosely 
tightened). Finally, fully tighten all the screws up to the specified torque. 


(3) Drill holes for screws in the heat sink exactly as specified. Smooth the 
surface by removing burrs and protrusions or indentations which might 
interfere with the installation of any part of the device. 


(4) A coating of silicone compound can be applied between the heat sink and 
the device to improve heat conductivity. Be sure to apply the coating 
thinly and evenly; do not use too much. Also, be sure to use a non-volatile 
compound, as volatile compounds can crack after a time, causing the heat 
radiation properties of the heat sink to deteriorate. 


(5) If the device is housed in a plastic package, use caution when selecting the type of silicone 
compound to be applied between the heat sink and the device. With some types, the base oil 
separates and penetrates the plastic package, significantly reducing the useful life of the 
device. 

A recommended silicone compound in which base oil separation is not a problem is
YG6260 from Toshiba Silicone.


(6) Heat-sink-equipped devices can become very hot during operation. Do not touch them, or you 
may sustain a burn. 


3.5.10 Tightening torque 


(1) Make sure the screws are tightened with fastening torques not exceeding the torque values 
stipulated in individual datasheets and databooks for the devices used. 


(2) Do not allow a power screwdriver (electrical or air-driven) to touch devices. 


3.5.11 Repeated device mounting and usage


Do not remount or re-use devices which fall into the categories listed below; these devices may 
cause significant problems relating to performance and reliability. 


(1) Devices which have been removed from the board after soldering 


(2) Devices which have been inserted in the wrong orientation or which have had reverse current 
applied 


(3) Devices which have undergone lead forming more than once 
3.6 Protecting Devices in the Field


3.6.1 Temperature 


Semiconductor devices are generally more sensitive to temperature than are other electronic 
components. The various electrical characteristics of a semiconductor device are dependent on the 
ambient temperature at which the device is used. It is therefore necessary to understand the 
temperature characteristics of a device and to incorporate device derating into circuit design. Note 
also that if a device is used above its maximum temperature rating, device deterioration is more 
rapid and it will reach the end of its usable life sooner than expected. 


3.6.2 Humidity 


Resin-molded devices are sometimes improperly sealed. When these devices are used for an 
extended period of time in a high-humidity environment, moisture can penetrate into the device 
and cause chip degradation or malfunction. Furthermore, when devices are mounted on a regular 
printed circuit board, the impedance between wiring components can decrease under high- 
humidity conditions. In systems which require a high signal-source impedance, circuit board 
leakage or leakage between device lead pins can cause malfunctions. The application of a 
moisture-proof treatment to the device surface should be considered in this case. On the other 
hand, operation under low-humidity conditions can damage a device due to the occurrence of 
electrostatic discharge. Unless damp-proofing measures have been specifically taken, use devices 
only in environments with appropriate ambient moisture levels (i.e. within a relative humidity 
range of 40% to 60%). 


3.6.3 Corrosive gases 


Corrosive gases can cause chemical reactions in devices, degrading device characteristics. 

For example, sulphur-bearing corrosive gases emanating from rubber placed near a device 
(accompanied by condensation under high-humidity conditions) can corrode a device's leads. The 
resulting chemical reaction between leads forms foreign particles which can cause electrical 
leakage. 


3.6.4 Radioactive and cosmic rays 


Most industrial and consumer semiconductor devices are not designed with protection against 
radioactive and cosmic rays. Devices used in aerospace equipment or in radioactive environments 
must therefore be shielded. 


3.6.5 Strong electrical and magnetic fields 


Devices exposed to strong magnetic fields can undergo a polarization phenomenon in their
plastic material, or within the chip, which gives rise to abnormal symptoms such as impedance
changes or increased leakage current. Failures have been reported in LSIs mounted near 
malfunctioning deflection yokes in TV sets. In such cases the device's installation location must be 
changed or the device must be shielded against the electrical or magnetic field. Shielding against 
magnetism is especially necessary for devices used in an alternating magnetic field because of the 
electromotive forces generated in this type of environment. 
3.6.6 Interference from light (ultraviolet rays, sunlight, fluorescent lamps and incandescent lamps)


Light striking a semiconductor device generates electromotive force due to photoelectric effects. In 
some cases the device can malfunction. This is especially true for devices in which the internal 
chip is exposed. When designing circuits, make sure that devices are protected against incident 
light from external sources. This problem is not limited to optical semiconductors and EPROMs. 
All types of device can be affected by light. 


3.6.7 Dust and oil


Just like corrosive gases, dust and oil can cause chemical reactions in devices, which will
adversely affect a device's electrical characteristics. To avoid this problem, do not use devices in 
dusty or oily environments. This is especially important for optical devices because dust and oil 
can affect a device's optical characteristics as well as its physical integrity and the electrical 
performance factors mentioned above. 


3.6.8 Fire


Semiconductor devices are combustible; they can emit smoke and catch fire if heated sufficiently. 
When this happens, some devices may generate poisonous gases. Devices should therefore never 
be used in close proximity to an open flame or a heat-generating body, or near flammable or 
combustible materials. 


3.7 Disposal of devices and packing materials


When discarding unused devices and packing materials, follow all procedures specified by local 
regulations in order to protect the environment against contamination. 
Precautions and Usage Considerations


This section describes matters specific to each product group which need to be taken into 
consideration when using devices. If the same item is described in Sections 3 and 4, the 
description in Section 4 takes precedence. 


Microcontrollers 


Design 


(1) Using resonators which are not specifically recommended for use 


Resonators recommended for use with Toshiba products in microcontroller oscillator applications 
are listed in Toshiba databooks along with information about oscillation conditions. If you use a 
resonator not included in this list, please consult Toshiba or the resonator manufacturer 
concerning the suitability of the device for your application. 


(2) Undefined functions 


In some microcontrollers certain instruction code values do not constitute valid processor 
instructions. Also, it is possible that the values of bits in registers will become undefined. Take 
care in your applications not to use invalid instructions or to let register bit values become 
undefined. 


TOSHIBA 4 Precautions and Usage Considerations 
1. Introduction


This user’s manual describes the C790 superscalar microprocessor for the system designer, 
paying special attention to the software interface and the bus interface. 


The C790 is a superscalar integrated implementation of a subset of the 64-bit MIPS IV
Instruction Set Architecture. It also implements a large extension to this instruction set
specially tailored for multimedia applications. It contains a CPU, a floating point
execution unit (Coprocessor 1), and primary instruction and data caches.


Two instructions can be decoded each cycle. These instructions are issued in-order and are
always completed in-order*. Data cache misses are non-blocking. A single outstanding
cache miss does not stall the pipeline, so that load misses or uncached loads are retired
out-of-order. Multiply, Multiply-Accumulate, Divide, Prefetch, and Coprocessor 1
instructions are also retired out-of-order.


* However, some instructions are retired out-of-order.
1.1 Features


The C790 core has the following features: 
• 2-way superscalar pipeline
• 128-bit (two 64-bit) data path and 128-bit system bus
• Instruction set architecture
  - 64-bit MIPS III instruction set implementation (except LL, SC, LLD and SCD)
  - Selected MIPS IV instruction set implementation (Prefetch and Move
    conditional instructions)
  - Three-operand Multiply and Multiply-Accumulate instructions
  - 128-bit (Quadword) load/store instructions
  - 128-bit multimedia instructions which configure the 128-bit data path as two
    64-bit, four 32-bit, eight 16-bit or sixteen 8-bit paths
• Configurable Endianness
• Branch prediction with Branch History Table (BHT) and Branch Target Address
  Cache (BTAC)
• Large on-chip caches
  - Instruction cache: 32KB, 2-way set-associative
  - Data cache: 32KB, 2-way set-associative (with write-back protocol)
  - Non-blocking load, hit under miss and early restart on first quadword
  - Data cache line locking
  - Prefetch functions
  - 64-byte cache line
• Fast integer Multiply and Multiply-Accumulate operations
• Memory management unit
  - 48-entry (96 pages) fully associative translation look-aside buffer (TLB)
  - 32-bit physical address space and 32-bit virtual address space
• IEEE754-1985 compatible FPU (MIPS III ISA supported)
• Performance counters supported
• Debug support
  - Multi-stepping of instruction execution
  - Hardware breakpoint on instruction addresses
  - Hardware breakpoint on data address and data value
  - PC tracing capability
• 128-bit demultiplexed data bus and 32-bit address bus
  - Pipelined addresses
  - Bus error supported
  - Multiple masters supported
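
The partitioning of the 128-bit data path into two 64-bit, four 32-bit, eight 16-bit or sixteen 8-bit paths can be pictured with the following Python sketch. It is a conceptual model only: the function names are hypothetical, the wrapping lane-wise add is an assumed example operation, and the sketch does not describe the semantics of any particular C790 multimedia instruction.

```python
def split_lanes(value_128, lane_bits):
    """Split a 128-bit integer into lanes, least significant lane first."""
    if lane_bits not in (8, 16, 32, 64):
        raise ValueError("lane width must be 8, 16, 32 or 64 bits")
    mask = (1 << lane_bits) - 1
    return [(value_128 >> (i * lane_bits)) & mask for i in range(128 // lane_bits)]


def parallel_add(a_128, b_128, lane_bits):
    """Lane-wise wrapping add; carries do not cross lane boundaries."""
    mask = (1 << lane_bits) - 1
    lanes_a = split_lanes(a_128, lane_bits)
    lanes_b = split_lanes(b_128, lane_bits)
    result = 0
    for i, (x, y) in enumerate(zip(lanes_a, lanes_b)):
        result |= ((x + y) & mask) << (i * lane_bits)
    return result


# The same operands give different results depending on the lane width.
print(hex(parallel_add(0xFF, 0x01, 8)), hex(parallel_add(0xFF, 0x01, 16)))
```

With 8-bit lanes the carry out of the lowest byte is discarded, whereas with 16-bit lanes it propagates into the next byte of the same lane.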
1.2 Related Documents


The following documents should be referenced: 
[1] MIPS R4000 Microprocessor User’s Manual 
[2] MIPS R10000 Microprocessor User’s Manual 
[3] MIPS IV Instruction Set (Revision 3.2) 
1.3 Revision History
Rev. 1.0: June 24th, 1999


Rev. 1.1: December 25th, 1999
Added IEEE754-compatible FPU feature (both single- and double-precision)


Rev. 1.2: March, 2000
Published


Rev. 2.0: April, 2001
Fixed numerous typographical errors
1.4 Conventions Used in This Manual
The names of registers, fields, and instructions are italicized as in this example:


The Status register (SR) is a read/write register that contains the operating mode, 
interrupt enabling, and diagnostic states of the processor. 


When a name is first introduced, it is shown in bold type. 
Ranges are denoted by a colon as in the following example: 


The 4-bit Coprocessor Usability (CU[3:0]) field controls the usability of four possible
coprocessors.


Conventions used in instruction descriptions are defined at the beginning of Appendices A,
B, C, and D.
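
The colon range notation can be read as "bits high down to low, inclusive". The following Python sketch shows how a range such as [3:0] selects bits from a value. The function name and the sample value are hypothetical, and the sketch works on a generic integer rather than on any specific C790 register layout.

```python
def extract_field(value, high, low):
    """Return bits [high:low] of value, inclusive, as an unsigned integer."""
    if high < low:
        raise ValueError("high must be >= low")
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


# A 4-bit field written as FIELD[3:0] has bits 3, 2, 1 and 0.
sample = 0b1011_0110
print(bin(extract_field(sample, 3, 0)), bin(extract_field(sample, 7, 4)))
```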
1.5 Restrictions for Use of the C790 CPU Core


1. Revision History 


| Revision | Date     | Contents                                           |
|          | 4/2/2001 | FLX01-FLX06; Restrictions for User's Manual Rev.2.0 |


Items 1 through 6 in the description below are the restrictions that must be obeyed 
when using the C790 CPU core (User's Manual Rev.2.0). 


Table 1-1. Restriction List 


| ID Contents 
FLX01 TLB exceptions masks bus errors. 


Bus errors are masked when Status.ERL==1 or Status.EXL = 1. 


kuseg becomes an uncached area when an error exception (Status.ERL = 1) occurs. 
2. Description
2.1 TLB exceptions mask bus errors (FLX01)

2.1.1 Phenomenon
There are cases in which a TLB exception occurring immediately after a bus error
masks the bus error, so that the bus error cannot be detected.

2.1.2 Corrective measures

This is caused by bus error exceptions having a lower priority than TLB
exceptions in instruction fetch and data access (refer to “5.5.1 Exception Priority”).
Check the following when programming a TLB exception handler.

1) In the TLB exception handler, check for the occurrence of any bus error
exceptions before a page refill.

2) In the TLB exception handler, check for the occurrence of any bus error
exceptions if the page to be refilled is incorrect.

3) In the TLB exception handler, execute with Status.EXL==0 and
Status.ERL==0 after the handler has stored the EPC, Cause, and
Status registers.

Pending bus errors can be confirmed by referring to Status.BEM.
2.2 Bus errors are masked when Status.ERL==1 or Status.EXL==1 (FLX02)

2.2.1 Phenomenon

Even if a bus error occurs during instruction fetch in an exception handler
(Status.EXL==1 or Status.ERL==1), the CPU does not accept the exception and
executes instruction code with indeterminate values read from the bus.

2.2.2 Corrective measures

This is caused by bus error exceptions being masked by Status.EXL==1 or
Status.ERL==1. Do not cause exceptions due to instruction fetch while
Status.EXL==1 or Status.ERL==1. Generating exceptions in an exception handler
is dangerous. For example:

1) The JR instruction may potentially cause an address error or a bus error. Do
not use the JR instruction while Status.EXL==1 or Status.ERL==1.

2) A mapped region may potentially cause a TLB exception. Be sure to execute
using an unmapped region such as the following:
0x8000_0000 - 0x9FFF_FFFF: kseg0
0xA000_0000 - 0xBFFF_FFFF: kseg1
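
A handler-placement check along these lines can be sketched in a few lines of Python. This is only an illustration: the function name `is_unmapped` is hypothetical, and the bounds are the standard MIPS 32-bit kseg0/kseg1 ranges (0x8000_0000 to 0x9FFF_FFFF and 0xA000_0000 to 0xBFFF_FFFF).

```python
# Illustrative sketch: decide whether a 32-bit virtual address lies in an
# unmapped segment (kseg0 or kseg1), where no TLB exception can be raised.
# The helper name is an assumption made for this example.

KSEG0 = (0x8000_0000, 0x9FFF_FFFF)  # unmapped segment kseg0
KSEG1 = (0xA000_0000, 0xBFFF_FFFF)  # unmapped segment kseg1


def is_unmapped(va: int) -> bool:
    """Return True if the address is inside kseg0 or kseg1."""
    va &= 0xFFFF_FFFF  # only 32-bit addresses are implemented
    in_kseg0 = KSEG0[0] <= va <= KSEG0[1]
    in_kseg1 = KSEG1[0] <= va <= KSEG1[1]
    return in_kseg0 or in_kseg1
```

Such a check could be run over the link addresses of exception handler code to confirm that the handler never executes from a mapped region.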


2.3 AdEL occurs in index-type ICACHE or BTAC CACHE instructions (FLX03)

2.3.1 Phenomenon

When executing the index-type CACHE instructions below in either User mode or
Supervisor mode, operation occasionally becomes undefined and generates AdEL
(Address Error exception; load and instruction fetch).

There are five index-type ICACHE sub operations as listed below.

00111 CACHE IXIN I$ index invalidate
00000 CACHE IXLTG I$ index load tag
00100 CACHE IXSTG I$ index store tag
00001 CACHE IXLDT I$ index load data
00101 CACHE IXSDT I$ index store data

There are four BTAC CACHE sub operations as listed below.
00010 CACHE BXLBT index load BTAC
00110 CACHE BXSBT index store BTAC
01100 CACHE BFH BTAC flush
01010 CACHE BHINBT hit invalidate BTAC

However, there is no problem when Status.KSU==Kernel. Please note that
Status.KSU==Kernel includes the kernel mode at Status.EXL==1 or
Status.ERL==1 as well. There is also no problem when Status.CU[0]==0 and
Status.KSU==User mode or Supervisor mode.

2.3.2 Corrective measures

When Status.CU[0]==1 and Status.KSU==Supervisor or User, execute with
VA[31]==0 when executing either index-type ICACHE or BTAC CACHE
instructions. VA here represents base reg + offset.
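
The conditions of this restriction can be collected into a single predicate. The following Python sketch is an interpretation of the text above, not a model of the hardware; the function name and the string encoding of the operating mode are assumptions made for the example.

```python
# Illustrative sketch of the FLX03 conditions. Returns True when an
# index-type ICACHE or BTAC CACHE instruction may hit the problem case,
# i.e. when the corrective measure (VA[31] == 0) is not satisfied.
# "mode" is assumed to be one of "kernel", "supervisor", "user".

def flx03_at_risk(cu0: int, mode: str, va: int) -> bool:
    if mode == "kernel":
        # Kernel mode (including EXL == 1 or ERL == 1) has no problem.
        return False
    if cu0 == 0:
        # With Status.CU[0] == 0 there is no problem in User/Supervisor mode.
        return False
    # User or Supervisor mode with CU[0] == 1: VA[31] must be 0.
    return ((va >> 31) & 1) == 1
```

Here `va` stands for base register plus offset, as in the text.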
2.4 kuseg becomes an uncached area when an error exception
(Status.ERL==1) occurs (FLX04)

2.4.1 Phenomenon
There are cases in which kuseg (0x0000_0000 - 0x7FFF_FFFF) becomes
uncached in an error exception handler (Status.ERL==1) and data consistency
with the cached areas (kseg, ksseg, kseg0) is lost.

2.4.2 Corrective measures

In an error exception handler (Status.ERL==1), when accessing kuseg
(0x0000_0000 - 0x7FFF_FFFF), guard the access with SYNC.L as follows:
SYNC.L
SW ku_seg
2.5 First two instructions in an exception handler are executed as NOP when a
bus error occurs (FLX05)

2.5.1 Phenomenon

There are cases in which the first two instructions in an exception handler are
executed as NOP instructions, when a certain exception occurs and then a bus error
occurs immediately before jumping to the exception handler.


2.5.2 Corrective measures 
Place NOP in the first two instruction locations in all exception handlers. 
2.6 Unexpected instruction-fetch bus errors occur when executing a Crashme
program (FLX06)
2.6.1 Phenomenon

In Kernel mode or Supervisor mode, unexpected instruction-fetch bus errors
occur when attempting to execute the Linux program called "Crashme", since it
executes prohibited instruction sequences that do not obey the following
programming restrictions.

In User mode, this phenomenon does not occur.

2.6.2 Corrective measures

In Kernel mode or Supervisor mode, obey the following programming
restrictions:


1) Any CACHE instruction must not be placed in a branch delay slot. 


2) SYNC.P must be located immediately before or immediately after any 
CACHE instruction. 
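
The two restrictions above lend themselves to a simple static check over an instruction stream. The sketch below is a hypothetical linter working on a list of mnemonic strings; the `BRANCHES` set is only an illustrative subset of branch and jump mnemonics, and treating "the instruction after a branch" as the delay slot is the usual MIPS convention.

```python
# Illustrative FLX06 checker over a list of mnemonics.
# Assumptions: mnemonics are upper-case strings, and BRANCHES is a small
# example subset of the Branch category, not a complete list.

BRANCHES = {"BEQ", "BNE", "J", "JAL", "JR", "JALR"}


def check_flx06(seq):
    """Return a list of (index, message) tuples for each violation found."""
    problems = []
    for i, op in enumerate(seq):
        if op != "CACHE":
            continue
        # Restriction 1: CACHE must not sit in a branch delay slot.
        if i > 0 and seq[i - 1] in BRANCHES:
            problems.append((i, "CACHE in branch delay slot"))
        # Restriction 2: SYNC.P immediately before or immediately after.
        before = i > 0 and seq[i - 1] == "SYNC.P"
        after = i + 1 < len(seq) and seq[i + 1] == "SYNC.P"
        if not (before or after):
            problems.append((i, "no adjacent SYNC.P"))
    return problems
```

For example, `check_flx06(["SYNC.P", "CACHE", "SYNC.P"])` reports no problems, while a CACHE placed directly after a branch is flagged.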
2. Architecture Overview


This chapter includes an overview of the C790 architecture. It discusses the following 
items: 


Block diagram and main modules 
Superscalar pipeline operation 
Instruction set 

Registers 

Memory Management 

Cache Memory 

Bus interface 

Floating Point Unit 

Performance Monitors 

Debug Support 
2.1 Block Diagram and Functional Block Descriptions

This section presents a block diagram of the main modules of the C790 and summarizes
the modules.

[Figure: block diagram. Blocks shown: PC Unit with PC Pipe and BTAC (64-entry, fully
associative); ITLB (2 entries); Instruction Cache (32 KB, 2-way set associative; Tag,
BHT, Predecode and Instruction RAMs); Issue Logic and Staging Registers (2-issue,
in-order); 48-entry TLB with TLB Refill Bus; COP0 Registers; Operand/Bypass Logic;
GPR and FPR (32 x 64-bit wide registers); Virtual Address Computation Logic; DTLB
(4 entries); Data Cache; the I0, I1, LS, BR and C1 (COP1/FPU) execution pipes; Result
and Move Buses; 128-bit BIU Bus; Bus Interface Unit; 128-bit CPU Bus.]

Figure 2-1. C790 Block Diagram
2.1.1 PC Unit


The 32-bit Program Counter (PC) holds the address of the instruction which is being 
executed. It also contains a 64-entry Branch Target Address Cache (BTAC) which stores 
branch target addresses used during branch prediction. 


2.1.2 MMU 


The Memory Management Unit supports the address translation functions of the CPU. It
supplies the DTLB (Data Translation Lookaside Buffer) and ITLB (Instruction
Translation Lookaside Buffer) with data via the TLB Refill Bus. Usage of these buffers is
described in Chapter 6.


2.1.3 Caches 


Operation of the Instruction Cache and the Data Cache is described in Chapter 7. For
each branch instruction present in the instruction cache, two bits of branch history are
stored in the Branch History Table (BHT).


2.1.4 Issue Logic and Staging Registers 


The issue logic decides how to route instructions to appropriate pipes. It issues up to 2 
instructions every cycle. Routing is described and discussed later in section 2.2. 


2.1.5 GPR (General Purpose Registers) and FPR (Floating-Point 
Registers) 


The General-Purpose Registers and the Floating-Point Registers are discussed in Section
2.3.

2.1.6 The Five Execution Pipes

2.1.6.1 I0 and I1 Pipes

There are two integer ALU pipelines (I0 and I1), each of which contains a complete 64-bit
ALU, Shifter and Multiply-Accumulate unit. The I0 pipeline contains the SA register used
for funnel shift operations. The two 64-bit ALU pipelines can be configured dynamically
(on an instruction-by-instruction basis) into a single 128-bit execution pipeline to
execute 128-bit Multimedia ALU, Shift and Multiply-Accumulate instructions.
Furthermore, the two ALU pipelines share a single 128-bit multimedia aligner.

2.1.6.2 LS - Load/Store Pipe


The Load/Store (LS) pipe contains logic to support a single 128-bit Load and Store 
instruction. 


2.1.6.3 BR - Branch Pipe

The Branch (BR) pipe contains logic to implement a single Branch instruction including
Branch comparators.

2.1.6.4 C1 - COP1/FPU Pipe

The C1 pipe contains logic to support a single- and double-precision Floating Point
coprocessor unit (COP1).
2.1.7 Operand/Bypass logic


This module takes data from the GPRs and from the Result and Move Buses, and routes 
the data to the pipelines. 


2.1.8 Response Buffer and Writeback Buffer 


The Writeback Buffer (WBB) is an 8 entry by 16 byte (one quadword) FIFO queuing up 
stores prior to accessing the CPU bus. It increases C790 performance by decoupling the 
processor from the latencies of the CPU bus. It is also used during the gathering operation 
of uncached accelerated stores; sequential stores less than a quadword in length are 
gathered in the WBB, thereby reducing bus bandwidth usage. 


2.1.9 UCAB 


The Uncached Accelerated Buffer (UCAB) is a 1 entry by 8 quadword buffer. It caches 128 
sequential bytes of data during an uncached accelerated load miss. Subsequent loads from 
the uncached accelerated address space get their data from this buffer if the address hits 
in the UCAB, thereby eliminating bus latencies and providing higher performance. 


2.1.10 Result and Move Buses 


The Result and Move Buses convey data between execution units, the data cache, and the 
Operand/Bypass Logic unit. 


2.1.11 Bus Interface Unit and BIU Bus 


The BIU connects the core to the rest of the system. It interfaces the core’s internal bus 
signals to the CPU Bus. 
2.2 Superscalar Pipeline Operation

The C790 has a six-stage superscalar pipeline. It can fetch, decode and execute a
maximum of two instructions in parallel each cycle.

This section discusses in more detail the five execution pipelines listed in Section 2.1. It
also discusses how instructions are routed among pipes.

2.2.1 Integer Instruction Pipeline Stages

The C790 contains four integer pipelines: the I0 and the I1 pipes, and the Load/Store and
Branch pipes. Each pipe consists of the following six stages, with each stage having two
phases:


I: Instruction Address Select 
Q: Instruction Queue 

R: Register Fetch 

A: Execution 

D: Data Fetch 

W: Write-back 


Figure 2-2 shows the six stages of an integer instruction pipeline.

[Figure: pipeline timing diagram; the current CPU cycle is marked.]

Figure 2-2. C790 Integer Instruction Pipeline
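
As a rough illustration of how the six stages overlap, the following Python sketch prints which stage each instruction occupies in each cycle. It assumes an ideal case that the text does not guarantee: two instructions enter the I stage every cycle and nothing ever stalls. The function names are made up for this example.

```python
# Illustrative model of the six-stage integer pipeline (I, Q, R, A, D, W)
# under ideal dual issue with no stalls. This is a teaching sketch, not a
# cycle-accurate model of the C790.

STAGES = ["I", "Q", "R", "A", "D", "W"]


def stage_at(start_cycle: int, cycle: int):
    """Stage occupied at `cycle` by an instruction entering I at `start_cycle`."""
    k = cycle - start_cycle
    return STAGES[k] if 0 <= k < len(STAGES) else None


def diagram(n_pairs: int):
    """Return one text row per instruction; '.' marks cycles outside the pipe."""
    total_cycles = n_pairs + len(STAGES) - 1
    rows = []
    for pair in range(n_pairs):
        for _ in range(2):  # two instructions are fetched per cycle
            row = " ".join(stage_at(pair, c) or "." for c in range(total_cycles))
            rows.append(row)
    return rows


if __name__ == "__main__":
    for line in diagram(3):
        print(line)
```

Each row is one instruction and each column is one CPU cycle, so reading down a column shows which stages are active in that cycle.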
I: Instruction Address Select
During the I stage, the following occurs:

• The sequential address is calculated
• The branch address is calculated
• The instruction address is selected from the following sources:
  - Sequential address
  - Actual Branch/Jump address
  - Predicted Branch Target address from the BTAC
  - Exception vector address
  - EPC and Error PC

Q: Instruction Queue
During the Q stage, the following occurs:

• The instruction translation look-aside buffer (ITLB) does the virtual-to-physical
  address translation
• The instruction cache (data, Tag, steering bits & BHT) fetch begins
• TLB read for instruction fetch starts
• The instruction cache fetch is completed
• TLB read for instruction fetch completes
• The instruction cache Tag hit check is determined and the way selection is
  done
• The appropriate instructions are selected by the steering bits

R: Register Fetch
During the R stage, the following occurs:

• Instructions are bussed to the appropriate execution units
• Register file is read
• Execution unit structural hazards are determined
• Instructions are decoded, data dependencies are determined and the
  appropriate instructions are issued

A: Execution
During the A stage, the following occurs:

• Results from the D or W stages are bypassed
• The execution units start and complete the integer arithmetic, logical, shift and
  multimedia instructions
• The iterative steps of the Multiply, Multiply-Accumulate, or Divide instructions
  are executed
• The virtual address for load and store instructions is calculated
• The branch condition is determined
• The DTLB is read
• The Data Cache and UCAB read starts
D: Data Fetch
During the D stage, the following occurs:

• The TLB read for a data access
• The Data Cache and UCAB read is completed
• The Data Cache Tag checking is completed
• Load or register data is obtained from COP1 (FPU)
• COP0 registers are read
• Data alignment and way selection is done for the data from the Data Cache
• Data sign extension is done
• Updating of the BHT bits and the BTAC is completed
• All the exceptions are detected

W: Write Back
During the W stage, the following occurs:

• For store operations, data is written to the Data Cache
• Data for coprocessor data transfer instructions is transferred to COP1 (FPU)
• For register-to-register and load instructions, the result is written to the
  register file
• COP0, COP1 (FPU) registers are written for coprocessor data transfer
  instructions
2.2.2 C1 (COP1/FPU) Instruction Pipeline Stages


The C790's C1 (COP1/FPU) pipeline consists of the following eight stages: 


I: Instruction Address Select 
Q: Instruction Queue 

R: Register Fetch 

T: COP1 Register Fetch 

X: FP Execution 1st Stage 
Y: FP Execution 2nd Stage 
Z: FP Execution 3rd Stage 
S: Register File Write Stage 


The eight stages of the pipeline for COP1/FPU are shown in Figure 2-3 with some pipeline
stages identified with two letters. COP1 instructions execute simultaneously in the main
integer pipeline I0 and the coprocessor 1 pipeline. The first letter identifies the main
integer pipeline stage and the second letter identifies the coprocessor pipeline stage.

[Figure: FPU pipeline timing diagram; the current CPU cycle is marked.]

Figure 2-3. FPU Pipeline

The I, Q, and R stages were previously described in Section 2.2.1. The following describes
stages specific to the COP1 pipeline:

T: COP1 Register Fetch

During the T stage, the following occurs:

• Register file read for operands
• Bypass muxes from the S Stage/W Stage for S/T overlap.
X: FP Execution 1st Stage

This stage is the first step for floating point operations.

During the X stage, the following occurs:
• Exceptions for input data are detected.
• Exception possibilities for the result are detected.
• The Booth function/Wallace multiplication is performed for multiply; the
  denormalization is performed for add/subtract.

Y: FP Execution 2nd Stage

This stage is the second step for floating point operations. The following occurs:
• Overflow/underflow on the exponent is tested
• Normalization for multiplication is done
• The significand is added/subtracted for add/subtract operations
• Leading zeros are counted to determine the shift amount for the normalization

Z: FP Execution 3rd Stage

This stage is the third step for floating point operations. The following occurs:

• Overflow/underflow detection
• Exponent readjustment
• Shift the significand for normalization
• Round the result
• Detect inexact exception

S: Register File Write Stage

During the S stage, the following occurs:

• FPR registers are written.
• FCSR31 is updated.
• Bypass values are passed to the T stage.
2.2.3 Classification and Routing of Instructions According to
Execution Pipelines

This section discusses how the five execution pipelines are used in conjunction with
instruction routing. Figure 2-4 identifies the specific execution pipelines into which
instructions of a particular class are routed, and shows which physical execution units
handle instructions from a particular logical pipe. Instruction categories are identified in
italics, and are shown within the physical pipes where they are executed. ALU
instructions can be executed in either integer pipe I0 or I1. COP1 Operate and COP1
Move instructions execute in two pipes as shown, as does the Wide Operate.

[Figure: instruction categories shown within the logical pipes and the physical pipes
that execute them.]

Figure 2-4. Instruction Routing in Logical Pipes and Physical Pipes
Table 2-1 shows the categories of instructions and the execution pipelines that can execute
those instructions. The instructions in a single category have the same issuing policy.
Instructions which require more than a single execution pipeline are identified in the
pipeline column with the "&" symbol. For example, COP1 Move requires both the LS
and the C1 execution pipelines. On the other hand, the ALU instructions can be executed
in either the I0 or the I1 execution pipelines.

Table 2-1. Categories of Instructions and How They Are Routed

Categories         Execution Pipeline   Instructions
Load/Store         LS                   Load, Store, Wide Load, Wide Store, Prefetch, CACHE
SYNC                                    Synchronization
ERET
SA Operate         I0                   Move to/from SA register
COP0                                    COP0 Coprocessor move, COP0 Coprocessor operations
COP1 Move (1)      LS & C1              COP1 Coprocessor move, COP1 Coprocessor Load/Store
COP1 Operate (2)   I0 & C1              COP1 Operate Instructions
ALU (3)            I0 or I1             Arithmetic, Shift, Logical, Trap, SYSCALL, BREAK
MAC0               I0                   Multiply and Multiply-Accumulate for HI/LO register,
                                        MFHI/LO, MTHI/LO
MAC1               I1                   Multiply and Multiply-Accumulate for HI1/LO1 register,
                                        MFHI1/LO1, MTHI1/LO1
Branch             BR                   Branch, Jump, Jump/Link, All Coprocessor Branches
Wide Operate (4)   I0 & I1              Wide ALU, Wide shift, Wide MAC, Funnel shift,
                                        Wide HI/LO Moves

(1) COP1 Move instructions execute concurrently in the LS and the C1 pipes.
(2) COP1 Operate instructions execute concurrently in the I0 and the C1 pipes.
(3) ALU instructions can be executed in either the I0 or the I1 pipes.
(4) Wide Operate instructions execute concurrently in the I0 and the I1 pipes.
2.2.4 Instruction Issue Combinations

The C790 always fetches two instructions. A pair of staging registers acts as a ‘bellows’
between the Q and the R stage. If an instruction can’t be issued in a particular cycle, it is
saved in the staging registers. In the next cycle the C790 again fetches two instructions
and tries to issue two (the one left over in the staging register from the previous cycle and
the next sequential one from the pair that is fetched). So the C790 tries to issue
two instructions each cycle whenever it can.

The two instructions that get issued go to the R stage of the pipeline and get associated
with one of two logical pipes: Pipe0 and Pipe1. The instructions are then routed to an
appropriate physical pipe for processing.

Instruction categories that can get issued to logical Pipe0 are:

1. ALU
2. Branch
3. Wide Operate
4. SA Operate
5. MAC0
6. COP1 Operate

An alternate way to view this is to recognize that logical Pipe0 is made up of the I0, C1
and BR execution pipelines. When issuing Wide Operate instructions logical Pipe0 also
uses the I1 execution pipeline.

Instruction categories that can get issued to logical Pipe1 are:

1. ALU
2. Branch
3. SYNC
4. ERET
5. Load/Store
6. COP1 Move
7. COP0
8. MAC1

An alternate way to view this is to recognize that logical Pipe1 is made up of the I1, LS,
C1 and BR execution pipelines.

Instruction categories are statically bound to a single logical pipe, that is, they can only
be issued to a particular logical pipe. The exceptions are the ALU and Branch instruction
categories, which can get issued to either of the two logical pipes. Thus the binding of
these two instruction categories to a particular logical pipe is done at instruction issue time.

There are some special cases of instruction sequences that are not allowed in the MIPS
ISA. An instruction from the Branch category is not allowed to have another instruction
from either the Branch or ERET category in its branch delay slot. So the following pairs of
instructions are illegal and effectively never issued together:

1. Branch - Branch
2. Branch - ERET
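
The static binding described above can be sketched as a small lookup. The Python below is only an illustration of the category lists and the two illegal pairs named in this section; it deliberately ignores data dependencies, resource hazards and any further illegal or stalling combinations, so a pair it accepts may still be rejected or stalled by the real issue logic. All names are chosen for the example.

```python
# Illustrative sketch of logical-pipe assignment for an instruction pair.
# Only the category lists and the two MIPS-illegal pairs from this section
# are modelled; hazards and other restrictions are out of scope.

PIPE0 = {"ALU", "Branch", "Wide Operate", "SA Operate", "MAC0", "COP1 Operate"}
PIPE1 = {"ALU", "Branch", "SYNC", "ERET", "Load/Store", "COP1 Move", "COP0", "MAC1"}
ILLEGAL = {("Branch", "Branch"), ("Branch", "ERET")}


def assign_pipes(older: str, younger: str):
    """Return (pipe of older, pipe of younger), or None if no assignment exists."""
    if (older, younger) in ILLEGAL:
        return None
    pipes = (PIPE0, PIPE1)
    for a, b in ((0, 1), (1, 0)):
        if older in pipes[a] and younger in pipes[b]:
            return (a, b)
    return None
```

For instance, two ALU instructions can be placed in Pipe0 and Pipe1, while a Load/Store paired with a COP0 instruction cannot be issued together in this model because both categories are bound to Pipe1.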
The following sequences of instructions are also not allowed in the C790. Branch-Likely
instructions are a subset of the Branch category (limited to the branch likely instructions).

1. Branch - SYNC.P
2. Branch - SYNC.L
3. Branch - CACHE *1
4. Branch-Likely - MTSA
5. Branch-Likely - MTSAB
6. Branch-Likely - MTSAH
7. Branch-Likely - TLBR *2
8. Branch-Likely - TLBWI *2
9. Branch-Likely - TLBWR *2

*1 CACHE instruction must be guarded by Sync instructions.

    Sync.P              Sync.L
    CACHE I$     or     CACHE D$
    Sync.P              Sync.L

*2 TLBR, TLBWI, TLBWR instructions must be followed by Sync.P

    TLBxx
    Sync.P

The following table shows the instruction categories which can be issued concurrently to
the two logical pipes. All combinations are legal except the ones marked with an “X”. The
combinations marked with a “Y” can be issued concurrently, i.e., they enter the R stage
together, but then the younger instruction stalls in the A stage for a single cycle in order to
avoid a resource hazard.

Table 2-2. Concurrently Issued Instruction Categories

[Table body not recoverable. Columns: logical Pipe0 categories. Rows: logical Pipe1
categories.]

X: Illegal combination
Y: Can be issued concurrently, but will stall due to a structural hazard.
2.3 Registers

The C790 extends the normal MIPS compatible register set by extending the general
purpose registers (GPRs) from 64 bits to 128 bits, adding an additional pair of HI/LO
registers for the I1 pipe and adding the SA register for the funnel shift instruction.

2.3.1 CPU Registers

The C790 has 128-bit wide GPRs. The upper 64 bits of the GPRs are only used by the
C790-specific “Quad Load/Store” and “Multimedia (Parallel)” instructions.

The HI1 and LO1 registers, which are the upper 64 bits of each of the 128-bit HI and LO
registers, are also used by new multiply and divide instructions, such as MULT1, MULTU1,
DIV1, DIVU1, MADD1, MADDU1, MFHI1, MFLO1, MTHI1, and MTLO1, which are
non-parallel I1 pipeline-specific instructions.

The SA register contains the shift amount used by the 256-bit funnel shift instruction.

2.3.2 FPU Registers

The floating point unit (COP1) has 64-bit wide floating point registers. It also contains 2
floating point control registers.
2.3.3 COP0 Registers
Table 2-3 identifies the COP0 registers of the C790.

Table 2-3. Coprocessor 0 Registers

Register No.   Register Name   Description                                        Purpose
0              Index           Programmable register to select TLB entry for
                               reading or writing
1              Random          Pseudo-random counter for TLB replacement
...
6              Wired           Number of wired TLB entries
...
8              BadVAddr        Bad virtual address
...
11             Compare         Timer compare
...
15             PRId            Processor Revision Identifier
...
21             (Reserved)      Undefined
22             (Reserved)      Undefined
23             BadPAddr        Bad physical address
...
29             TagHi           Cache Tag register (high bits)
2.4 Memory Management

The C790 processor provides a memory management unit (MMU) which uses an on-chip
translation look-aside buffer (TLB) to translate virtual addresses into physical addresses.

The C790 supports the MIPS compatible 32-bit address and 64-bit data mode. Only 32-bit
virtual and physical addresses have been implemented. There is no requirement for
address sign extension. Address error exception checking is not done on the “upper”
32 bits (which are ignored). The only conditions that generate the address error
exception are address alignment errors and segment protection errors. In Kernel mode,
the program counter can wrap around from kseg3 to kuseg without causing an address
error exception.

Since there is only one addressing mode, all four MIPS ISAs (I, II, III, IV) and the
C790-specific ISA are available without any restrictions in all three processor modes
(with the appropriate MIPS ISA coprocessor usable restrictions). As such, the reserved
instruction (RI) exception occurs only when the processor actually tries to execute an
undefined opcode.

Features
• MIPS III-compatible 32-bit MMU
• Operating Modes: User, Supervisor, and Kernel
• TLB: 48 entries of even/odd page pairs (96 pages), fully associative
• Page Size: 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB
• ITLB: 2 entries
• DTLB: 4 entries
• Address Sizes: Virtual Address Size = 32 bit, 2 Gbyte per user process
                 Physical Address Size = 32 bit, 4 Gbyte
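
To make the page-size figures concrete, the following Python sketch splits a virtual address into a page number and an offset, and computes how much memory the 48-entry TLB can map at once for a given page size (48 entries times two pages per entry). The helper names are invented for this example, and the sketch does not model the actual TLB entry format.

```python
# Illustrative arithmetic for the MMU features listed above.
# Names are assumptions for this example only.

PAGE_SIZES = {
    "4KB": 4 << 10, "16KB": 16 << 10, "64KB": 64 << 10, "256KB": 256 << 10,
    "1MB": 1 << 20, "4MB": 4 << 20, "16MB": 16 << 20,
}
TLB_ENTRIES = 48       # each entry maps an even/odd page pair
PAGES_PER_ENTRY = 2


def split_va(va: int, page_size: int):
    """Return (virtual page number, offset within page) for a 32-bit address."""
    va &= 0xFFFF_FFFF
    return va // page_size, va % page_size


def tlb_reach(page_size: int) -> int:
    """Bytes of virtual memory mapped when every TLB page uses `page_size`."""
    return TLB_ENTRIES * PAGES_PER_ENTRY * page_size
```

With 4 KB pages the TLB covers 96 x 4 KB = 384 KB at a time; with 16 MB pages it covers 96 x 16 MB = 1.5 GB.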
2.5 Cache Memory
The C790 core contains both an instruction cache and a separate data cache.

Features

The following are the main features of the caches:

Separate Instruction Cache and Data Cache
Virtually indexed and physically tagged caches 
Write-back policy for the Data Cache 
Data Cache and Instruction Cache burst read sequential ordering 
Cache Size: Instruction Cache: 32 KB 
Data Cache: 32 KB 
Line Size: 64 Bytes 
Refill size: 64 Bytes 
Associativity: 2-way set-associative 
Write Policy: Write-back and write allocate 
Data order for block reads: Sequential ordering 
Data order for block writes: Sequential ordering 
Instruction cache miss restart: After all data received 
Data cache miss restart: Early restart on first quadword 
Cache parity: No 
Cache Locking: Data Cache Line Lock. 
Controlled by CACHE instruction 
Cache Snooping: No 
Non-blocking load: Yes 
Hit Under Miss: Yes (Multiple hits under one miss are supported) 
Data Cache Prefetch: Yes 
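
The cache parameters above determine the number of sets and how an address selects a line. The Python sketch below works this out for a 32 KB, 2-way set-associative cache with 64-byte lines; it is plain arithmetic on the listed figures, and the function names are chosen for the example rather than taken from the manual.

```python
# Illustrative cache geometry derived from the listed features:
# 32 KB capacity, 64-byte lines, 2-way set-associative.

CACHE_BYTES = 32 * 1024
LINE_BYTES = 64
WAYS = 2
SETS = CACHE_BYTES // (LINE_BYTES * WAYS)  # number of sets (index values)


def line_offset(va: int) -> int:
    """Byte offset within the 64-byte cache line."""
    return va % LINE_BYTES


def cache_index(va: int) -> int:
    """Set index selected by the (virtual) address."""
    return (va // LINE_BYTES) % SETS
```

This gives 256 sets, so under these figures the line offset occupies address bits 5:0 and the set index occupies bits 13:6 of the virtual address.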
2.6 Bus Interface

The C790 CPU core is connected to the rest of the system, and to external devices, through
the group of on-chip C790 system bus signals called the CPU Bus.

Features

• Separate data and address buses (demultiplexed operation)
• 128-bit data bus
• Clocked synchronous operations
• Peak transfer rate of 2.1 GB/sec (@133 MHz bus clock)
• 8/16/32/64/128-bit and burst accesses
• Multimaster capability
• Pipelined operations
• No turn-around or dead cycles between transfers


The CPU Bus does not provide: 


• Cache coherency support
• Split transactions
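
The quoted peak transfer rate follows directly from the bus width and clock. As a quick check, assuming one 128-bit transfer per bus clock with no dead cycles (the function name is made up for this example):

```python
# Illustrative peak-bandwidth arithmetic for a 128-bit bus at 133 MHz,
# assuming one full-width transfer on every bus clock.

def peak_bytes_per_second(bus_bits: int, clock_hz: int) -> int:
    return (bus_bits // 8) * clock_hz


rate = peak_bytes_per_second(128, 133_000_000)  # 16 bytes x 133e6 per second
```

Sixteen bytes per clock at 133 MHz is 2,128,000,000 bytes per second, which rounds to the 2.1 GB/sec stated above.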


2.7 Floating Point Unit 


The floating point unit is IEEE 754-1985 compatible, the same as the FPU in the TX49HF CPU
core.

Main Features:

• Tightly coupled to the C790 Integer pipeline.
• Supports both double and single precision formats as defined in the IEEE-754
  specification
• No hardware support for denormalized numbers as defined in the IEEE-754
  specification. Software (an exception handler) supports them.
• The FPU supports five IEEE exceptions and one MIPS-defined exception.
• ADD, SUB, MUL, DIV, ABS, MOV, NEG, SQRT, compare and convert are
  supported
2.8 Performance Counter


The performance counter provides the means for gathering statistical information about 
the internal events of the CPU and the pipeline during program execution. The statistics 
gathered during program execution aid in tuning the performance of hardware and 
software systems based on the processor. 


The performance counter consists of one control register and two counters. The control 
register controls the functions of the performance counter while the counters count the 
number of events specified by the control register. 


Features: 


• Two performance counter registers
• Over twenty different events within the processor can be counted
• Counting can be selectively enabled in User, Supervisor, Kernel, and Exception
  modes


2.9 Debug and Tracing Functions 


The C790 supports real-time PC tracing. Pipeline status, target addresses of indirect 
jumps, and exception vectors are made available on special signals. The executed 
instruction sequence can be restored from signals and the source program. 


Features: 


• One Instruction Address Breakpoint register
• One Instruction Address Breakpoint Mask register
• One Data Address Breakpoint register
• One Data Address Breakpoint Mask register
• One Data Value Breakpoint register
• One Data Value Breakpoint Mask register
• Each breakpoint individually enabled
• Breakpoint function can be selectively enabled in User, Supervisor, Kernel, and
  Exception modes
• External Trigger signal can be generated when breakpoint occurs
• 11 signals used to provide real-time PC tracing function




3. Instruction Set Overview and Summary 


This chapter provides an overview of the C790 instruction set. Refer to Appendices A - D 
for detailed descriptions of individual instructions. 
3.1 Introduction


The C790 supports all MIPS III instructions with the exception of 64-bit multiply, 64-bit 
divide, Load Linked and Store Conditional instructions. It also supports a limited number 
of MIPS IV instructions and additional C790-specific instructions, such as Multiply/Add 
instructions and multimedia instructions. 


The instruction set can be divided into the following groups: 


• Load and Store
• Computational
• Jump and Branch
• Miscellaneous
• System Control Coprocessor (COP0)
• Coprocessor 1 (COP1)
• C790-specific




3.2 CPU Instruction Set Formats 


There are three instruction formats: immediate (I-type), jump (J-type), and register
(R-type), as shown in Figure 3-1. The use of a small number of instruction formats simplifies
instruction decoding (thus allowing higher-frequency operation) and allows the compiler
to synthesize more complicated (and less frequently used) operations and address modes
from these three formats as needed.

I-type (Immediate)
 31    26 25   21 20   16 15                      0
|   op   |  rs   |  rt   |       immediate         |

J-type (Jump)
 31    26 25                                      0
|   op   |                target                   |

R-type (Register)
 31    26 25   21 20   16 15   11 10    6 5       0
|   op   |  rs   |  rt   |  rd   |  sa   |  funct  |

op         6-bit operation code
rs         5-bit source register specifier
rt         5-bit target (source/destination) register or branch condition
immediate  16-bit immediate value, branch displacement or address displacement
target     26-bit jump target address
rd         5-bit destination register specifier
sa         5-bit shift amount
funct      6-bit function field

Figure 3-1. CPU Instruction Formats
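
The field boundaries in Figure 3-1 translate directly into shifts and masks. The following Python sketch extracts every field from a 32-bit instruction word; which fields are meaningful depends on the format of the instruction in question, and the function name `decode_fields` is chosen for this example only.

```python
# Illustrative field extraction for the I-, J- and R-type formats.
# All fields are extracted unconditionally; the caller picks the ones
# that apply to the instruction's format.

def decode_fields(word: int) -> dict:
    word &= 0xFFFF_FFFF
    return {
        "op": (word >> 26) & 0x3F,       # bits 31..26
        "rs": (word >> 21) & 0x1F,       # bits 25..21
        "rt": (word >> 16) & 0x1F,       # bits 20..16
        "rd": (word >> 11) & 0x1F,       # bits 15..11 (R-type)
        "sa": (word >> 6) & 0x1F,        # bits 10..6  (R-type)
        "funct": word & 0x3F,            # bits 5..0   (R-type)
        "immediate": word & 0xFFFF,      # bits 15..0  (I-type)
        "target": word & 0x03FF_FFFF,    # bits 25..0  (J-type)
    }
```

For example, the word 0x00221821 decodes to op = 0, rs = 1, rt = 2, rd = 3, sa = 0 and funct = 0x21.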
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3.3 Instruction Set Summary 


The C790 supports MIPS III instructions¹ as well as a limited number of MIPS IV
instructions. A large number of C790-specific instructions, such as multiply/add
instructions and multimedia instructions, have also been implemented.


3.3.1. Load/Store Instructions 


The instructions in this group transfer data of different sizes: bytes, halfwords, words,
doublewords and quadwords. Signed and unsigned integers of different sizes are
supported by loads that either sign-extend or zero-extend the data loaded into the
register.


Load and store instructions are immediate (I-type) instructions that move data between
memory and the general registers. The only addressing mode that load and store
instructions directly support is base register plus 16-bit signed immediate offset.
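
The base-plus-offset addressing mode can be sketched as follows. This is a minimal illustration, assuming a 32-bit address space (the addr_bits parameter) and hypothetical function names; it is not taken from the C790 documentation.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def effective_address(base, offset16, addr_bits=32):
    """Base register contents plus the sign-extended 16-bit immediate offset."""
    mask = (1 << addr_bits) - 1          # assumption: addresses wrap at addr_bits
    return (base + sign_extend16(offset16)) & mask
```

For example, an offset field of 0xFFFC represents -4, so a base of 0x1000 yields the address 0x0FFC.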


3.3.1.1. Normal Loads and Stores 


The C790 does not support Load Linked and Store Conditional instructions, LL, LLD, SC 
and SCD. For details of these instructions refer to Appendix A. 


Table 3-1. Load / Store Instructions 


Mnemonic    Description                 Defined in
LWU         Load Word Unsigned          MIPS III
SDR         Store Doubleword Right      MIPS III

¹ Note: The C790 does not support the following MIPS III instructions:
64-bit multiply and divide instructions (DMULT, DMULTU, DDIV, DDIVU)
Semaphore instructions (LL, LLD, SC, SCD)
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3.3.1.2 Multimedia Loads and Stores 


The C790 implements 128-bit (quadword) load and store instructions for multimedia
purposes. For details of these instructions refer to Appendix B.


Table 3-2. Multimedia Load / Store Instructions 


Mnemonic    Description        Defined in
LQ          Load Quadword      C790
SQ          Store Quadword     C790


3.3.1.3. Coprocessor Loads and Stores 


These loads and stores are coprocessor instructions. A particular coprocessor is enabled if
the corresponding CU bit is set in the COP0 Status register. Otherwise executing one of these
instructions generates a Coprocessor Unusable exception. For details of these instructions
refer to Appendices C and D.


Table 3-3. Coprocessor Load / Store Instructions 


Mnemonic    Description                            Defined in
LDC1        Load Doubleword to Floating Point      MIPS II
LWC1        Load Word to Floating Point            MIPS I
SDC1        Store Doubleword from Floating Point   MIPS II
SWC1        Store Word from Floating Point         MIPS I


3.3.1.4 Data Formats and Addressing 


The C790 processor uses five data formats: 


128-bit quadword 
64-bit doubleword 
32-bit word 

16-bit halfword 
8-bit byte 


Byte ordering within each of the larger data formats (halfword, word, doubleword) can
be configured in either big-endian or little-endian order. Endianness refers to the location
of byte 0 within the multi-byte data structure. Figure 3-2 and Figure 3-3 show the
ordering of bytes within words and the ordering of words within multiple-word structures
for the big-endian and little-endian conventions.


When the C790 processor is configured as a big-endian system, byte 0 is the
most-significant (leftmost) byte, thereby providing compatibility with MC 68000® and IBM 370®
conventions. Figure 3-2 shows this configuration.
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Bit # 


           Word
           Address    31    24 23    16 15     8 7      0
Higher       12       |   12   |   13   |   14   |   15   |
Address       8       |    8   |    9   |   10   |   11   |
              4       |    4   |    5   |    6   |    7   |
Lower         0       |    0   |    1   |    2   |    3   |
Address


Figure 3-2. Big-Endian Byte Ordering 


When configured as a little-endian system, byte 0 is always the least-significant
(rightmost) byte, which is compatible with iAPX® x86 and DEC VAX® conventions.


Bit #

           Word
           Address    31    24 23    16 15     8 7      0
Higher       12       |   15   |   14   |   13   |   12   |
Address       8       |   11   |   10   |    9   |    8   |
              4       |    7   |    6   |    5   |    4   |
Lower         0       |    3   |    2   |    1   |    0   |
Address

Figure 3-3. Little-Endian Byte Ordering 


In this text, bit 0 is always the least-significant (rightmost) bit: thus, bit designations are
always little-endian (although no instructions explicitly designate bit positions within
words).
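
The two byte orderings can be illustrated with a small sketch. The helper name byte_at and the sample value are assumptions chosen for illustration; the sketch only shows which byte of a word lands at byte offset 0 under each convention.

```python
def byte_at(value, offset, order):
    """Return the byte stored at the given byte offset of a 32-bit word.

    order is "big" or "little", matching the two conventions in
    Figure 3-2 and Figure 3-3.
    """
    return value.to_bytes(4, order)[offset]

word = 0x11223344  # most-significant byte is 0x11, least-significant is 0x44
```

Under the big-endian convention byte 0 holds 0x11; under the little-endian convention byte 0 holds 0x44.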
Figure 3-4 and Figure 3-5 show little-endian and big-endian byte ordering in doublewords.


Most-significant byte                                    Least-significant byte
Bit #    63  56 55  48 47  40 39  32 31  24 23  16 15   8 7    0
Byte #   |  7   |  6   |  5   |  4   |  3   |  2   |  1   |  0   |

Least-significant word: bits 31 to 0

Bits in a byte:  Bit #  7 6 5 4 3 2 1 0

Figure 3-4. Little-Endian Data in a Doubleword 


Most-significant byte                                    Least-significant byte
Bit #    63  56 55  48 47  40 39  32 31  24 23  16 15   8 7    0
Byte #   |  0   |  1   |  2   |  3   |  4   |  5   |  6   |  7   |

Least-significant word: bits 31 to 0

Bits in a byte:  Bit #  7 6 5 4 3 2 1 0

Figure 3-5. Big-Endian Data in a Doubleword 
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The CPU uses byte addressing for halfword, word, doubleword, and quadword accesses 
with the following alignment constraints: 


• Halfword accesses must be aligned on an even byte boundary (0, 2, 4...).

• Word accesses must be aligned on a byte boundary divisible by four (0, 4, 8...).

• Doubleword accesses must be aligned on a byte boundary divisible by eight (0, 8,
16...).

• Quadword accesses must be aligned on a byte boundary divisible by sixteen (0,
16, 32...).
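
The alignment constraints above amount to requiring that the address be a multiple of the access size. A minimal sketch, with the table name ACCESS_SIZE and function name is_aligned chosen here for illustration:

```python
# Access sizes in bytes, taken from the alignment list above.
ACCESS_SIZE = {"halfword": 2, "word": 4, "doubleword": 8, "quadword": 16}

def is_aligned(address, access):
    """True if address satisfies the alignment constraint for the access type."""
    return address % ACCESS_SIZE[access] == 0
```

For instance, address 8 is acceptable for halfword, word, and doubleword accesses but not for a quadword access.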


The following special instructions load and store words that are not aligned on 4-byte
(word) or 8-byte (doubleword) boundaries:


LWL LWR SWL SWR
LDL LDR SDL SDR


These instructions are used in pairs to provide addressing of misaligned words.
Addressing misaligned data incurs one additional instruction cycle over that required for
addressing aligned data. This extra cycle is due to the extra instruction in the "pair"
(e.g., LWL and LWR form a pair). Also note that the CPU moves the unaligned data at the
same rate as a hardware mechanism.
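
How an LWL/LWR pair assembles a misaligned word can be sketched in software. The model below is an illustration under stated assumptions: big-endian byte ordering, a 32-bit register (sign extension into the upper register bits is ignored), and memory represented as a Python bytes object. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def _aligned_word_be(mem, addr):
    """Read the aligned 32-bit word containing addr, big-endian."""
    base = addr & ~3
    return int.from_bytes(mem[base:base + 4], "big")

def lwl_be(reg, mem, addr):
    """Load Word Left: fill the left (most-significant) part of reg with the
    bytes from addr up to the end of its aligned word."""
    k = addr & 3
    keep = (1 << (8 * k)) - 1                    # low bytes of reg left unchanged
    return ((_aligned_word_be(mem, addr) << (8 * k)) & MASK32) | (reg & keep)

def lwr_be(reg, mem, addr):
    """Load Word Right: fill the right (least-significant) part of reg with the
    bytes from the start of the aligned word up to addr."""
    shift = 8 * (3 - (addr & 3))
    keep = MASK32 & ~(MASK32 >> shift)           # high bytes of reg left unchanged
    return (reg & keep) | (_aligned_word_be(mem, addr) >> shift)

def load_misaligned_word_be(mem, addr):
    """LWL at addr followed by LWR at addr + 3, as the pair is normally used."""
    reg = lwl_be(0, mem, addr)
    return lwr_be(reg, mem, addr + 3)

memory = bytes(range(16))   # byte n holds the value n
```

With this memory image, loading the misaligned word at byte address 3 combines byte 3 from the first aligned word with bytes 4 to 6 from the next one.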


Figure 3-6 and Figure 3-7 show the access of a misaligned word that has byte address 3.


Bit #

           31    24 23    16 15     8 7      0
Higher     |    4   |    5   |    6   |        |
Address
Lower      |        |        |        |    3   |
Address

Figure 3-6. Big-Endian Misaligned Word Addressing 

Bit #

           31    24 23    16 15     8 7      0
Higher     |        |    6   |    5   |    4   |
Address
Lower      |    3   |        |        |        |
Address

Figure 3-7. Little-Endian Misaligned Word Addressing 




3.3.1.5 Defining Access Types 


Access type indicates the size of the C790 processor data item to be loaded or stored, set 
by the load or store instruction opcode. 


Regardless of access type or byte ordering (endianness), the address given specifies the
low-order byte in the addressed field. For a big-endian configuration, the low-order byte is the
most-significant byte; for a little-endian configuration, the low-order byte is the
least-significant byte.


The access type, together with the four low-order bits of the address, defines the bytes
accessed within the addressed quadword (shown in Table 3-4 and Table 3-5). Only the
combinations shown in Table 3-4 and Table 3-5 are permissible; other combinations cause
address error exceptions.
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Table 3-4. Defining Access Types (Big-Endian) 


Access Type     Low-Order Address Bits    Bytes Accessed (big endian)
Mnemonic          3   2   1   0           (bit 127 ... bit 0 holds bytes 0 ... 15)

Quadword          0   0   0   0           0-15
Doubleword        0   0   0   0           0-7
                  1   0   0   0           8-15
Septibyte         0   0   0   0           0-6
                  0   0   0   1           1-7
                  1   0   0   0           8-14
                  1   0   0   1           9-15
Sextibyte         0   0   0   0           0-5
                  0   0   1   0           2-7
                  1   0   0   0           8-13
                  1   0   1   0           10-15
Quintibyte        0   0   0   0           0-4
                  0   0   1   1           3-7
                  1   0   0   0           8-12
                  1   0   1   1           11-15
Word              0   0   0   0           0-3
                  0   1   0   0           4-7
                  1   0   0   0           8-11
                  1   1   0   0           12-15
Triplebyte        0   0   0   0           0-2
                  0   0   0   1           1-3
                  0   1   0   0           4-6
                  0   1   0   1           5-7
                  1   0   0   0           8-10
                  1   0   0   1           9-11
                  1   1   0   0           12-14
                  1   1   0   1           13-15
Halfword          0   0   0   0           0-1
                  0   0   1   0           2-3
                  0   1   0   0           4-5
                  0   1   1   0           6-7
                  1   0   0   0           8-9
                  1   0   1   0           10-11
                  1   1   0   0           12-13
                  1   1   1   0           14-15
Byte              0   0   0   0           0
                  0   0   0   1           1
                  0   0   1   0           2
                  0   0   1   1           3
                  0   1   0   0           4
                  0   1   0   1           5
                  0   1   1   0           6
                  0   1   1   1           7
                  1   0   0   0           8
                  1   0   0   1           9
                  1   0   1   0           10
                  1   0   1   1           11
                  1   1   0   0           12
                  1   1   0   1           13
                  1   1   1   0           14
                  1   1   1   1           15
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Table 3-5. Defining Access Types (Little-Endian) 


Access Type     Low-Order Address Bits    Bytes Accessed (little endian)
Mnemonic          3   2   1   0           (bit 127 ... bit 0 holds bytes 15 ... 0)

Quadword          0   0   0   0           15-0
Doubleword        0   0   0   0           7-0
                  1   0   0   0           15-8
Septibyte         0   0   0   0           6-0
                  0   0   0   1           7-1
                  1   0   0   0           14-8
                  1   0   0   1           15-9
Sextibyte         0   0   0   0           5-0
                  0   0   1   0           7-2
                  1   0   0   0           13-8
                  1   0   1   0           15-10
Quintibyte        0   0   0   0           4-0
                  0   0   1   1           7-3
                  1   0   0   0           12-8
                  1   0   1   1           15-11
Word              0   0   0   0           3-0
                  0   1   0   0           7-4
                  1   0   0   0           11-8
                  1   1   0   0           15-12
Triplebyte        0   0   0   0           2-0
                  0   0   0   1           3-1
                  0   1   0   0           6-4
                  0   1   0   1           7-5
                  1   0   0   0           10-8
                  1   0   0   1           11-9
                  1   1   0   0           14-12
                  1   1   0   1           15-13
Halfword          0   0   0   0           1-0
                  0   0   1   0           3-2
                  0   1   0   0           5-4
                  0   1   1   0           7-6
                  1   0   0   0           9-8
                  1   0   1   0           11-10
                  1   1   0   0           13-12
                  1   1   1   0           15-14
Byte              0   0   0   0           0
                  0   0   0   1           1
                  0   0   1   0           2
                  0   0   1   1           3
                  0   1   0   0           4
                  0   1   0   1           5
                  0   1   1   0           6
                  0   1   1   1           7
                  1   0   0   0           8
                  1   0   0   1           9
                  1   0   1   0           10
                  1   0   1   1           11
                  1   1   0   0           12
                  1   1   0   1           13
                  1   1   1   0           14
                  1   1   1   1           15


3.3.1.6 Scheduling a Load Delay Slot 


A load instruction that does not allow its result to be used by the instruction immediately
following is called a delayed load instruction. The instruction slot immediately following
this delayed load instruction is referred to as the load delay slot.


In the C790 processor, the instruction immediately following a load instruction can use 
the contents of the loaded register. In such cases, however, hardware interlocks insert 
additional clock cycles. Consequently, scheduling load delay slots can be desirable, both 
for performance and R-Series processor compatibility. However, the scheduling of load 
delay slots is not absolutely required. 
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3.3.2 Computational Instructions 


The instructions in this group perform two’s complement arithmetic, logical operations, or 
shifts on integers represented in two’s complement notation. 


Computational instructions can be either in register (R-type) format, in which both
operands are registers, or in immediate (I-type) format, in which one operand is a 16-bit
immediate.


Computational instructions perform the following operations on register values: 
Arithmetic 

Logical 

Shift 

Multiply 

Divide 

These operations fit in the following four categories of computational instructions: 
ALU immediate instructions 

Three-Operand Register-Type instructions 

Shift instructions 

Multiply and Divide instructions 


For detailed information of individual instructions, refer to Appendix A. 


*Note: The C790 does not support 64-bit Multiply and Divide instructions, DMULT, DMULTU, 
DDIV, and DDIVU. 


3.3.2.1. ALU Immediate Instructions 


Table 3-6. ALU Immediate Instructions 


Mnemonic    Description      Defined in
ORI         OR Immediate     MIPS I
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3.3.2.2 Three Operand Register-Type Instructions 
Table 3-7. Three Operand Register-Type Instructions 


Mnemonic    Description                      Defined in
ADD         Add                              MIPS I
SUBU        Subtract Unsigned                MIPS I
DADD        Doubleword Add                   MIPS III
DADDU       Doubleword Add Unsigned          MIPS III
DSUB        Doubleword Subtract              MIPS III
DSUBU       Doubleword Subtract Unsigned     MIPS III
SLT         Set on Less Than                 MIPS I
SLTU        Set on Less Than Unsigned        MIPS I
AND         AND                              MIPS I




3.3.2.3 Shift Instructions 
Table 3-8. Shift Instructions 


| Mnemonic | Description | Defined in | 


3.3.2.4 Multiply and Divide Instructions 


These are the standard MIPS instructions for multiply, divide, and move to/from HI/LO
registers, executed on the I0 pipeline's MAC unit. See also the discussion of C790-specific
multiply and divide instructions in 3.3.7.1.


Table 3-9. Multiply and Divide Instructions 


Mnemonic    Description          Defined in
MULT        Multiply             MIPS I
MULTU       Multiply Unsigned    MIPS I


3.3.2.5  64-Bit Operations 


The result of operations that use incorrectly sign-extended 32-bit values for 64-bit
operations is unpredictable.
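
One reading of this rule is that a 32-bit value held in a 64-bit register is expected to have bits 63:32 equal to bit 31. The sketch below checks that property; this interpretation and the function names are assumptions for illustration, not a statement of the C790 specification.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def sign_extend32(value):
    """Sign-extend the low 32 bits of value to a 64-bit register image."""
    value &= 0xFFFFFFFF
    return (value | 0xFFFFFFFF00000000) if value & 0x80000000 else value

def is_sign_extended32(reg64):
    """True if the 64-bit register image is a correctly sign-extended 32-bit value."""
    return (reg64 & MASK64) == sign_extend32(reg64)
```

A register holding 0xFFFFFFFF80000000 passes the check, while 0x0000000080000000 does not.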
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3.3.3. Jump and Branch Instructions 


The architecture defines PC-relative conditional branches, a PC-region unconditional 
jump, an absolute (register) unconditional jump, and a similar set of procedure calls that 
record a return link address in a general register. For convenience, these are all referred 
to here as branches. 


All branches have an architectural delay of one instruction. When a branch is taken, the 
instruction immediately following the branch instruction, in the branch delay slot, is 
executed before the branch to the target instruction takes place. Conditional branches 
come in two versions that treat the instruction in the delay slot differently when the 
branch is not taken and execution falls through. The ‘branch’ instructions execute the 
instruction in the delay slot, but the ‘branch likely’ instructions do not. (They are said to 
‘nullify’ it.) 


By convention, if an exception or interrupt prevents the completion of an instruction 
occupying a branch delay slot, the instruction stream is continued by reexecuting the 
branch instruction. To permit this, branches must be restartable; procedure calls may not 
use the register in which the return link is stored (usually register 31) to determine the 
branch target address. 


For detailed information on individual instructions, refer to Appendix A. Branch on
Coprocessor instructions are covered in the coprocessor discussions.


3.3.3.1. Jump Instructions 


Subroutine calls in high-level languages are usually implemented with Jump or Jump and
Link instructions, both of which are J-type instructions. In J-type format, the 26-bit target
address is shifted left 2 bits and combined with the high-order 4 bits of the current program
counter to form an absolute address.


Returns, dispatches, and large cross-page jumps are usually implemented with the Jump
Register or Jump and Link Register instructions. Both are R-type instructions that take
the 32-bit byte address contained in one of the general purpose registers.
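
The J-type target calculation can be written out as a short sketch, assuming a 32-bit program counter. The function name jump_target is hypothetical.

```python
def jump_target(pc, target26):
    """Combine the high-order 4 bits of pc with the 26-bit target shifted left 2."""
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)
```

Because only the low 28 bits come from the instruction, the destination always lies in the same 256 MByte region as the program counter value used, which is why Table 3-10 describes these as jumps within a 256 MByte region.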


Table 3-10. Jump Instructions Jumping Within a 256 MByte Region 


Mnemonic    Description      Defined in
J           Jump             MIPS I
JAL         Jump and Link    MIPS I


Table 3-11. Jump Instructions to Absolute Address 


Mnemonic    Description                Defined in
JR          Jump Register              MIPS I
JALR        Jump and Link Register     MIPS I
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3.3.3.2 Branch Instructions 


All branch instruction target addresses are computed by adding the address of the
instruction in the branch delay slot to the 16-bit offset (shifted left 2 bits and
sign-extended to 32 bits). All branches occur with a delay of one instruction.


In the case of a Branch Likely instruction, if the branch is not taken, the instruction in the
delay slot is nullified.
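
The branch target calculation can be sketched as follows, assuming 32-bit addresses and 4-byte instructions so that the delay slot sits at the branch address plus 4. The function name branch_target is hypothetical.

```python
def branch_target(branch_pc, offset16):
    """Delay-slot address plus the sign-extended 16-bit offset shifted left 2."""
    delay_slot = branch_pc + 4                   # instruction after the branch
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                          # sign-extend the 16-bit field
        offset -= 0x10000
    return (delay_slot + (offset << 2)) & 0xFFFFFFFF
```

An offset field of 0xFFFF (that is, -1) therefore branches back to the branch instruction itself, and an offset of 0 targets the instruction following the delay slot.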


Table 3-12. PC-Relative Conditional Branch Instructions Comparing 2 Registers 


Mnemonic    Description                             Defined in
BEQ         Branch on Equal                         MIPS I
BNE         Branch on Not Equal                     MIPS I
BLEZ        Branch on Less Than or Equal to Zero    MIPS I


Table 3-13. PC-Relative Conditional Branch Instructions Comparing Against Zero 


Mnemonic    Description                                             Defined in
BLTZ        Branch on Less Than Zero                                MIPS I
BGEZ        Branch on Greater Than or Equal to Zero                 MIPS I
BLTZAL      Branch on Less Than Zero and Link                       MIPS I
BGEZAL      Branch on Greater Than or Equal to Zero and Link        MIPS I
BLTZL       Branch on Less Than Zero Likely                         MIPS II
BGEZL       Branch on Greater Than or Equal to Zero Likely          MIPS II
BLTZALL     Branch on Less Than Zero and Link Likely                MIPS II
BGEZALL     Branch on Greater Than or Equal to Zero and Link Likely MIPS II
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3.3.4 Miscellaneous Instructions 


3.3.4.1. Exception Instructions 


Exception instructions have as their sole purpose causing an exception that will transfer 
control to a software exception handler in the kernel. System call and breakpoint 
instructions cause exceptions unconditionally. The trap instructions cause exceptions 
conditionally based upon the result of a comparison. For detail of these instructions, refer 
to the individual instruction as described in Appendix A. 


Table 3-14. Exception Instructions 


Mnemonic    Description                     Defined in
SYSCALL     System Call                     MIPS I
TGE         Trap if Greater or Equal        MIPS II
TNEI        Trap if Not Equal Immediate     MIPS II


3.3.4.2 Serialization Instructions 


The order in which memory accesses from load and store instructions appear outside the
C790 is not specified by the architecture. The SYNC (or SYNC.L) instruction creates a
point in the executing instruction stream at which the relative order of some loads and
stores is known. Loads and stores executed before the SYNC (or SYNC.L) are retired before
loads and stores after the SYNC (or SYNC.L) can start.


In order to guarantee the completion of certain instructions a SYNC.P instruction can be
used. Instructions executed before a SYNC.P instruction are completed before instructions
after the SYNC.P can start. For details of these instructions, refer to the SYNC instruction
described in Appendix A.


Table 3-15. Serialization Instructions 


SYNC²       Synchronization     MIPS II


² This includes the SYNC, SYNC.L and SYNC.P instructions.
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3.3.4.3 MIPS IV Instructions 


The C790 supports a subset of the MIPS IV instructions: the Conditional Move instructions
and the Prefetch instruction.


Conditional move operations allow 'IF' statements to be represented without branches.
'THEN' and 'ELSE' clauses are computed unconditionally and the results are placed in a
temporary register. Conditional move operations then transfer the temporary results to
their true register.
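
The branch-free pattern described above can be modelled in a few lines. The functions movn and movz mirror the MOVN and MOVZ semantics (move rs to rd when rt is non-zero or zero, respectively); the function select_without_branch and its arithmetic are hypothetical and serve only to show the pattern.

```python
def movn(rd, rs, rt):
    """MOVN: the new value of rd is rs if rt is not zero, otherwise rd is unchanged."""
    return rs if rt != 0 else rd

def movz(rd, rs, rt):
    """MOVZ: the new value of rd is rs if rt is zero, otherwise rd is unchanged."""
    return rs if rt == 0 else rd

def select_without_branch(cond, a, b):
    """Model of: if cond: result = a + 1  else: result = b - 1."""
    then_tmp = a + 1            # THEN clause, computed unconditionally
    else_tmp = b - 1            # ELSE clause, computed unconditionally
    result = else_tmp           # start from the ELSE result
    result = movn(result, then_tmp, cond)   # overwrite when cond is non-zero
    return result
```

Both clauses are evaluated every time, so the pattern pays off only when the clauses are cheap and free of side effects.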


The Prefetch instruction fetches data expected to be used in the near future and places it 
in the data cache. 


For detail of these instructions, refer to the individual instruction as described in 
Appendix A. 


Table 3-16. MIPS IV Instructions 


MOVN Move Conditional on Not Zero MIPS IV 
MOVZ Move Conditional on Zero MIPS IV 


PREF Prefetch MIPS IV 
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3.3.5 System Control Coprocessor (COPO) Instructions 


COP0 instructions perform operations specifically on the System Control Coprocessor
registers to manipulate the memory management, exception handling, performance
monitor, and debug facilities of the processor.


COP0 instructions are enabled if the processor is in Kernel mode, or if bit 28 (CU) is set in
the Status register. Otherwise executing one of these instructions generates a Coprocessor
Unusable Exception.


For details of COP0 instructions refer to Appendix C.


Table 3-17. System Control Coprocessor Instructions 


Mnemonic    Description                              Defined in
BC0F        Branch on Coprocessor 0 False            MIPS I
BC0T        Branch on Coprocessor 0 True             MIPS I
BC0FL       Branch on Coprocessor 0 False Likely     MIPS II
BC0TL       Branch on Coprocessor 0 True Likely      MIPS II
DI          Disable Interrupt                        C790
TLBP        Probe TLB for Matching Entry
MTBPC Move To Breakpoint Control Register C790 
MFBPC Move From Breakpoint Control Register C790 


MTDAB Move To Data Address Breakpoint Register C790 
MFDAB Move From Data Address Breakpoint Register |C790 


Register 
Register 
Register 
Register 
Register 
Mask Register 


MTDVB Move To Data Value Breakpoint Register C790 
MFDVB Move From Data Value Breakpoint Register C790 
MTDVBM Move To Data Value Breakpoint Mask Register |C790 


MFDVBM Move From Data Value Breakpoint Mask C790 
Register 
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3.3.6 Coprocessor 1 (COP1) 


Coprocessor instructions perform operations in their respective coprocessors. Coprocessor
loads and stores are I-type, and coprocessor computational instructions have
coprocessor-dependent formats. Coprocessor load and store instructions are summarized in 3.3.1.3.


3.3.6.1 Coprocessor 1 (COP1) Instructions 


COP1 instructions are enabled if bit 29 (CU) is set in the Status register. Otherwise
executing one of these instructions generates a Coprocessor Unusable Exception. For
details of COP1 instructions refer to Appendix D.


Table 3-18. Coprocessor 1 Instructions 


Mnemonic      Description                                           Defined in
CFC1          Move Control Word from Floating Point
CEIL.L.fmt    Floating Point Ceiling Convert to Long Fixed Point    MIPS III
CEIL.W.fmt    Floating Point Ceiling Convert to Word Fixed Point    MIPS II
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3.3.7 C790-Specific Instructions 


The C790 extends its instruction set from the original MIPS architecture. The following 
instructions are supported: 


Three-operand Multiply and Multiply/Add instructions 
Multiply instructions for Pipeline 1 

Multimedia instructions 

Enable interrupt and Disable interrupt instructions 


For more information, refer to Appendices B and C. 
3.3.7.1. Integer Multiply / Divide Instructions 


The standard MIPS instructions for multiply, divide and move to / from HI / LO registers
execute on the I0 pipeline's MAC unit. A complete set of new instructions has also been
defined to execute on the I1 pipeline's MAC unit. All of these instructions are shown in the
following table.


Table 3-19. C790-Specific Multiply and Divide Instructions 


OpCode      Description

(Three Operand Multiply and Multiply-add)
MADD        Multiply/Add
MADDU       Multiply/Add Unsigned
MULT        Multiply (3-operand)
MULTU       Multiply Unsigned (3-operand)

(Multiply Instructions for Pipeline 1)
MULT1       Multiply 1
MULTU1      Multiply Unsigned 1
DIV1        Divide 1
DIVU1       Divide Unsigned 1
MADD1       Multiply/Add 1
MADDU1      Multiply/Add Unsigned 1
MFHI1       Move From HI 1
MFLO1       Move From LO 1
MTHI1       Move To HI 1
MTLO1       Move To LO 1


The C790 supports three-operand multiply instructions that store the multiply result to a
general purpose register in addition to the LO register. These instructions therefore do not
need the MFLO instruction to move data from the LO register to a general purpose
register.

• MULT rd, rs, rt       HI || LO = rs * rt (signed)
                        rd = new LO contents
• MULTU rd, rs, rt      HI || LO = rs * rt (unsigned)
                        rd = new LO contents


The C790 also supports new multiply-add instructions, MADD and MADDU. These
instructions execute multiply-accumulate operations using the HI and LO registers as
accumulators.


• MADD rd, rs, rt       HI || LO += rs * rt (signed)
                        rd = new LO contents

• MADDU rd, rs, rt      HI || LO += rs * rt (unsigned)
                        rd = new LO contents
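
The signed forms of these operations can be modelled as below. This is a simplified sketch: it treats HI and LO as 32-bit halves of a 64-bit accumulator and ignores how the results are extended into the wider C790 registers. The function names and return convention (HI, LO, rd) are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a signed two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mult(rs, rt):
    """MULT rd, rs, rt: HI || LO = rs * rt (signed); rd = new LO contents."""
    product = (to_signed32(rs) * to_signed32(rt)) & MASK64
    hi, lo = product >> 32, product & MASK32
    return hi, lo, lo            # (HI, LO, rd)

def madd(hi, lo, rs, rt):
    """MADD rd, rs, rt: HI || LO += rs * rt (signed); rd = new LO contents."""
    acc = (((hi << 32) | lo) + to_signed32(rs) * to_signed32(rt)) & MASK64
    hi, lo = acc >> 32, acc & MASK32
    return hi, lo, lo            # (HI, LO, rd)
```

Returning rd alongside HI and LO reflects the point made above: the low half of the result is available in a general purpose register without a separate MFLO.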
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3.3.7.2 Multimedia Instructions 


The C790 defines a new set of instructions to support multimedia applications. These
instructions are shown in Table 3-20. Most of these instructions do parallel operations on
data by combining the execution units of the two pipelines (I0 and I1). They form a 128-bit
path and then do parallel operations on either two 64-bit data items, four 32-bit data
items, eight 16-bit data items, or sixteen 8-bit data items.


In order to support the 128-bit datapath, 128-bit load/store operations are also
implemented.
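
The idea of operating on several data items in one 128-bit register can be illustrated with a software model of a parallel halfword add (PADDH in Table 3-20). The sketch assumes wrap-around (non-saturating) arithmetic in each of the eight 16-bit lanes and represents a 128-bit register as a Python integer; the function name is hypothetical.

```python
def paddh(a, b):
    """Add eight independent 16-bit lanes of two 128-bit values, wrapping per lane."""
    result = 0
    for lane in range(8):
        shift = 16 * lane
        lane_sum = ((a >> shift) + (b >> shift)) & 0xFFFF   # no carry between lanes
        result |= lane_sum << shift
    return result
```

The key property is that a carry out of one lane never reaches its neighbour: adding 1 to a lane holding 0xFFFF gives 0x0000 in that lane and leaves the next lane untouched.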


Table 3-20. Multimedia Instructions 


OpCode      Description

PADDB       Parallel Add Byte
PSUBB       Parallel Subtract Byte
PADDH       Parallel Add Halfword
PSUBH       Parallel Subtract Halfword
PADDW       Parallel Add Word
PSUBW       Parallel Subtract Word
PADSBH      Parallel Add/Subtract Halfword
PADDSB      Parallel Add with Signed Saturation Byte
PSUBSB      Parallel Subtract with Signed Saturation Byte
PADDSH      Parallel Add with Signed Saturation Halfword
PSUBSH      Parallel Subtract with Signed Saturation Halfword
PADDSW      Parallel Add with Signed Saturation Word
PSUBSW      Parallel Subtract with Signed Saturation Word
PADDUB      Parallel Add with Unsigned Saturation Byte
PSUBUB      Parallel Subtract with Unsigned Saturation Byte
PADDUH      Parallel Add with Unsigned Saturation Halfword
PSUBUH      Parallel Subtract with Unsigned Saturation Halfword
PADDUW      Parallel Add with Unsigned Saturation Word
PSUBUW      Parallel Subtract with Unsigned Saturation Word

(Absolute)
PABSH       Parallel Absolute Halfword
PABSW       Parallel Absolute Word

(Multiply and Divide)
PMULTW      Parallel Multiply Word
PMULTUW     Parallel Multiply Unsigned Word
PDIVW       Parallel Divide Word
PDIVUW      Parallel Divide Unsigned Word
PMADDW      Parallel Multiply/Add Word
PMADDUW     Parallel Multiply/Add Unsigned Word
PMSUBW      Parallel Multiply/Subtract Word
PMFHI       Parallel Move From HI
PMFLO       Parallel Move From LO
PMTHI       Parallel Move To HI
PMTLO       Parallel Move To LO
PMULTH      Parallel Multiply Halfword
PMADDH      Parallel Multiply/Add Halfword
PMSUBH      Parallel Multiply/Subtract Halfword
PMFHL       Parallel Move From HI/LO
PMTHL       Parallel Move To HI/LO
PHMADH      Parallel Horizontal Multiply/Add Halfword
PHMSBH      Parallel Horizontal Multiply/Subtract Halfword
PDIVBW      Parallel Divide Broadcast Word
OpCode      Description

(SA Operation)
MFSA        Move from SA Register
MTSA        Move to SA Register
MTSAB       Move Byte Count to SA Register
MTSAH       Move Halfword Count to SA Register

(Shift)
PSLLH       Parallel Shift Left Logical Halfword
PSRLH       Parallel Shift Right Logical Halfword
PSRAH       Parallel Shift Right Arithmetic Halfword
PSLLW       Parallel Shift Left Logical Word
PSRLW       Parallel Shift Right Logical Word
PSRAW       Parallel Shift Right Arithmetic Word
PSLLVW      Parallel Shift Left Logical Variable Word
PSRLVW      Parallel Shift Right Logical Variable Word
PSRAVW      Parallel Shift Right Arithmetic Variable Word

(Logical)
PAND        Parallel AND
POR         Parallel OR
PXOR        Parallel XOR
PNOR        Parallel NOR

(Compare)
PCGTB       Parallel Compare for Greater Than Byte
PCEQB       Parallel Compare for Equal Byte
PCGTH       Parallel Compare for Greater Than Halfword
PCEQH       Parallel Compare for Equal Halfword
PCGTW       Parallel Compare for Greater Than Word
PCEQW       Parallel Compare for Equal Word
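OpCode      Description

(Quadword Load Store)
LQ          Load Quadword
SQ          Store Quadword

(Pack/Extend)
PPACW       Parallel Pack to Word
PEXTUB      Parallel Extend Upper from Byte
PEXTLB      Parallel Extend Lower from Byte
PEXTUH      Parallel Extend Upper from Halfword
PEXTLH      Parallel Extend Lower from Halfword
PEXTUW      Parallel Extend Upper from Word
PEXTLW      Parallel Extend Lower from Word
PEXT5       Parallel Extend from 5 bits
PPAC5       Parallel Pack to 5 bits

(Others)
PCPYH       Parallel Copy Halfword
PCPYLD      Parallel Copy Lower Doubleword
PCPYUD      Parallel Copy Upper Doubleword
PREVH       Parallel Reverse Halfword
PINTH       Parallel Interleave Halfword
PINTEH      Parallel Interleave Even Halfword
PEXEH       Parallel Exchange Even Halfword
PEXCH       Parallel Exchange Center Halfword
PEXEW       Parallel Exchange Even Word
PEXCW       Parallel Exchange Center Word
PROT3W      Parallel Rotate 3 Words
QFSRV       Quadword Funnel Shift Right Variable
PLZCW       Parallel Leading Zero or One Count Word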
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3.4 User Instruction Latency and Repeat Rate 


Table 3-21 shows the latencies and repeat rates for all user instructions executed in the I0, I1,
BR, LS and C1 execution pipelines. Kernel instructions are not included, nor are
instructions not issued to these execution pipelines. See Figure 2-1 and Figure 2-4 for
the execution pipeline names.


Table 3-21. Latencies and Repeat Rates for User Instruction 
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4. CPU and COP0 Registers 


This chapter describes the CPU registers and the System Control Coprocessor (COP0)
registers.


The CPU registers group consists of:


• General Purpose Registers (GPRs),

• Multiply and Divide registers (HI and LO registers) that hold the results of
integer multiply and divide,

• The SA register, which is used by the funnel shift instructions,

• The Program Counter (PC) register.


The COP0 registers control the processor state and report its status. These registers can
be read using the MFC0 instruction and written using the MTC0 instruction.
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4.1 CPU Registers 


The central processing unit (CPU) provides the following registers: 


• 32 128-bit General Purpose Registers (GPRs)

• Four registers that hold the results of integer multiply and divide operations
(HI0, LO0, HI1, and LO1)

• Shift Amount (SA) register

• Program Counter


The C790 has 128-bit-wide General Purpose Registers (GPRs). The upper 64 bits of the
GPRs are only used by the C790-specific "Quad Load/Store" and "Multimedia (Parallel)"
instructions.


HI0 and LO0 are the standard 64-bit HI and LO registers. HI1 and LO1, which are the
upper 64 bits of the 128-bit HI and LO registers, are only used by the new multiply and
divide instructions, such as MULT1, MULTU1, DIV1, DIVU1, MADD1, MADDU1, MFHI1,
MFLO1, MTHI1, and MTLO1. All these instructions are equivalent to existing
instructions which operate on the HI0 and LO0 registers.


The Shift Amount (SA) register specifies the shift amount used by the funnel shift
instruction. The shaded registers in Figure 4-1 are new architecturally-visible registers
that are specific to the C790.
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General Purpose Registers 
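   127              64 63               0
   |   upper 64 bits  |   lower 64 bits  |     r0 to r31

HI and LO Registers
   63                0 63               0
   |       HI1        |       HI0        |
   |       LO1        |       LO0        |

Shift Amount (SA) Register

Program Counter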


Figure 4-1. CPU Registers 
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4.1.1. General Purpose Registers 


The standard 64-bit CPU general purpose registers have been extended to 128-bit
registers. New instructions have been defined to use the upper 64 bits of these registers.


Two of the CPU general purpose registers have special assigned functions:


• r0 is hardwired to a value of zero, and can be used as the target register for any
instruction whose result is to be discarded. r0 can also be used as a source when
a zero value is needed.


• r31 is the link register used by the Jump and Link instructions. In general, it
should not be used by other instructions.


4.1.2 HI and LO Registers


The standard 64-bit HI and LO registers have been extended to 128-bit registers. New
instructions have been defined to use the upper 64 bits of these registers. HI0 and LO0
are the standard 64-bit HI and LO registers. HI1 and LO1 are the upper 64 bits of the
128-bit HI and LO registers.


These four registers (HI0, LO0, HI1, LO1) store:


• the product of integer multiply operations, or

• the accumulation of integer multiply-accumulate operations, or

• the quotient (in LO0 or LO1) and remainder (in HI0 or HI1) of integer divide
operations.


4.1.3 Shift Amount (SA) Register 


The SA register specifies the shift amount used by the funnel shift instruction. This is a 
new architecturally-visible register and it needs to be saved and restored as part of the 
processor state. New instructions have been defined to move values between this register 
and the general purpose registers. 


4.1.4 Program Counter (PC) 


The Program Counter (PC) holds the address of the instruction which is being executed.
The PC is incremented automatically by 4 when an instruction other than a control-transfer
instruction (that is, other than a branch, jump, ERET, SYSCALL, or TRAP) is executed.
Control-transfer instructions change the value of the PC to the target address they specify.
An exception also changes the contents of the PC to the specified exception vector address.
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4.2 System Control Coprocessor (COP0) Registers 
COP0 registers are listed in Table 4-1.


Table 4-1. Coprocessor 0 Registers 


No. Name 
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4.2.1. Index Register (0) 


   31  30                       6 5           0
   | P |            0            |    Index    |
     1              25                  6


Figure 4-2. Index Register 


The Index register is a 32-bit read/write register containing six bits to index an entry in
the TLB. The high-order bit of the register records the success or failure of a TLB Probe
(TLBP) instruction.


The Index register also specifies the TLB entry affected by TLB Read (TLBR) or TLB
Write Index (TLBWI) instructions.


Figure 4-2 shows the format of the Index register; Table 4-2 describes the Index register
fields.


Table 4-2. Index Register Field Description 


| Field | Bits | Description | Type | Initial Value |
| P | 31 | Probe failure. Set to 1 when the previous TLB Probe (TLBP) instruction was unsuccessful. | Read/Write | Undefined |
| Index | 5:0 | Index to the TLB entry affected by the TLB Read and TLB Write instructions. | Read/Write | Undefined |
| | 30:6 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |
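
The field layout above can be sketched in Python as follows. This is a minimal illustration only; the function name `decode_index` is hypothetical, and the raw register value is assumed to have been read already by some other means.

```python
def decode_index(value):
    """Split a 32-bit Index register value into its P and Index fields."""
    value &= 0xFFFFFFFF
    return {
        "P": (value >> 31) & 0x1,  # 1 = the previous TLBP was unsuccessful
        "Index": value & 0x3F,     # TLB entry number, bits 5:0
    }
```

A handler would typically check `P` after a TLBP before using `Index` with TLBR or TLBWI.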
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4.2.2 Random Register (1) 


 31                           6 5          0
|          Reserved            |   Random   |
              26                     6


Figure 4-3. Random Register 


The Random register is a read-only register. The least significant six bits index an entry
in the TLB. This register decrements on every cycle in which an instruction is executed.
Its value ranges between an upper and a lower bound, as follows:


• A lower bound is set by the number of TLB entries reserved for exclusive use by
the operating system (the contents of the Wired register).
• An upper bound is set by the total number of TLB entries (47 maximum).


The Random register specifies the entry in the TLB that is affected by the TLB Write 
Random (TLBWR) instruction. The register does not need to be read for this purpose; 
however, the register is readable to verify proper operation of the processor. 


To simplify testing, the Random register is set to the value of the upper bound upon 
system reset. This register is also set to the upper bound when the Wired register is 
written. 
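
The bounded countdown described above can be modelled with a short Python sketch. The wrap-around from the lower bound back to the upper bound is an assumption of this example (the text only states that the value stays between the two bounds), and the names `TLB_UPPER_BOUND` and `next_random` are hypothetical.

```python
TLB_UPPER_BOUND = 47  # highest TLB index, per the text above


def next_random(current, wired):
    """Return the Random value after one executed instruction.

    Assumed behaviour: count down, and wrap to the upper bound once the
    lower bound (the Wired register value) has been reached.
    """
    if current <= wired:
        return TLB_UPPER_BOUND
    return current - 1
```

With `wired = 45`, successive calls starting from 47 yield 46, 45, and then 47 again, so entries below the Wired value are never selected.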


Figure 4-3 shows the format of the Random Register; Table 4-3 describes the Random 
Register fields. 


Table 4-3. Random Register Fields 


| Field | Bits | Description | Type | Initial Value |
| Random | 5:0 | TLB Random index. | Read-only | Upper bound (47) |
| | 31:6 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |




4.2.3 EntryLo0 Register (2), and EntryLo1 Register (3) 


EntryLo0
 31      26 25                    6 5     3  2   1   0
| Reserved |          PFN          |   C   | D | V | G |
     6                20               3     1   1   1
EntryLo1
 31      26 25                    6 5     3  2   1   0
| Reserved |          PFN          |   C   | D | V | G |
     6                20               3     1   1   1


Figure 4-4. EntryLo0 and EntryLo1 Registers


The EntryLo0 and EntryLo1 registers consist of two registers that have a similar format:


• EntryLo0 is used for even virtual pages.
• EntryLo1 is used for odd virtual pages.


The EntryLo0 and EntryLo1 registers are read/write registers. They hold the physical
page frame number (PFN) of the TLB entry for even and odd pages, respectively, when
performing TLB read and write operations.


Figure 4-4 shows the format of the EntryLo0 and EntryLo1 registers; Table 4-4 describes
the EntryLo0 and EntryLo1 register fields.


Table 4-4. EntryLo0 and EntryLo1 Register Fields 


| Field | Bits | Description | Type | Initial Value |
| PFN | 25:6 | Page frame number; the upper bits of the physical address. | Read/Write | Undefined |
| C | 5:3 | Specifies the TLB page coherency attribute. 000: Reserved; 001: Reserved; 010: Uncached; 011: Cacheable, write-back, write allocate; 100: Reserved; 101: Reserved; 110: Reserved | Read/Write | Undefined |
| D | 2 | Dirty. If this bit is set, the page is marked as dirty and therefore writable. This bit is actually a write-protect bit that software can use to prevent alteration of data. | Read/Write | Undefined |
| V | 1 | Valid. If this bit is set, it indicates that the TLB entry is valid; otherwise, a TLBL or TLBS miss will occur. | Read/Write | Undefined |
| G | 0 | Global. If this bit is set in both EntryLo0 and EntryLo1, then the processor ignores the ASID during TLB look-up. | Read/Write | Undefined |
| | 31:26 | Reserved. Must be written as zeroes, and returns zeroes when read. EntryLo0[31] is reserved for Kernel use. It contains the written value. This bit has no effect on any CPU or TLB operation. | Read-only | |


Reserved codes in the C field may not be written correctly into a TLB entry by the TLBWI
or TLBWR instruction.
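
As a sketch of the layout in Figure 4-4, the following Python helpers pack and unpack an EntryLo value. The function names are hypothetical, and the code only illustrates the bit positions; it does not model how the value reaches the register.

```python
def pack_entrylo(pfn, c, d, v, g):
    """Build an EntryLo0/EntryLo1 value from its fields."""
    if not 0 <= pfn < (1 << 20):
        raise ValueError("PFN is a 20-bit field")
    if not 0 <= c < 8:
        raise ValueError("C is a 3-bit field")
    return (pfn << 6) | (c << 3) | (int(bool(d)) << 2) | (int(bool(v)) << 1) | int(bool(g))


def unpack_entrylo(value):
    """Split an EntryLo value into PFN (25:6), C (5:3), D (2), V (1), G (0)."""
    return {
        "PFN": (value >> 6) & 0xFFFFF,
        "C": (value >> 3) & 0x7,
        "D": (value >> 2) & 0x1,
        "V": (value >> 1) & 0x1,
        "G": value & 0x1,
    }
```

For instance, a valid, writable, cacheable mapping (C = 011) for an illustrative frame number 0x12345 packs to 0x48D15E.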
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4.2.4 Context Register (4) 


 31       23 22                    4 3        0
|  PTEBase  |        BadVPN2        | Reserved |
      9                19                4


Figure 4-5. Context Register Format 


The Context register is a read/write register containing the pointer to an entry in the page
table entry (PTE) array. This array is an operating system data structure that stores
virtual-to-physical address translations. When there is a TLB miss, the CPU loads the
TLB with the missing translation from the PTE array. Normally, the operating system
uses the Context register to address the current page map which resides in the
kernel-mapped segment, kseg3. The Context register duplicates some of the information
provided in the BadVAddr register, but the information is arranged in a form that is more
useful for a software TLB exception handler. Figure 4-5 shows the format of the Context
register; Table 4-5 describes the Context register fields.


Table 4-5. Context Register Fields 
| Field | Bits | Description | Type | Initial Value |
| PTEBase | 31:23 | This field is a read/write field for use by the operating system. It is normally written with a value that allows the operating system to use the Context register as a pointer into the current PTE array in memory. | Read/Write | Undefined |
| BadVPN2 | 22:4 | This field is written by hardware on a miss. It contains the virtual page number (VPN) of the most recent virtual address that did not have a valid translation. | Read-only | Undefined |
| | 3:0 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |


The 19-bit BadVPN2 field contains bits 31:13 of the virtual address that caused the TLB 
miss; bit 12 is excluded because a single TLB entry maps to an even-odd page pair. For a 4 
KB page size, this format can directly address the pair-table of 8-byte PTEs. For other 
page and PTE sizes, shifting and masking this value produces the appropriate address. 
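
The way hardware forms the Context value on a miss can be sketched as follows. This is an illustrative Python model with a hypothetical function name; `pte_base` stands for the 9-bit value software previously wrote to PTEBase.

```python
def context_value(pte_base, bad_vaddr):
    """Model the Context register contents after a TLB miss on bad_vaddr."""
    bad_vpn2 = (bad_vaddr >> 13) & 0x7FFFF  # virtual address bits 31:13
    return ((pte_base & 0x1FF) << 23) | (bad_vpn2 << 4)
```

Because BadVPN2 sits at bit 4, each even-odd page pair advances the value by 16 bytes, which matches two 8-byte PTEs per pair for a 4 KB page size.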
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4.2.5 PageMask Register (5) 


 31      25 24            13 12           0
| Reserved |      MASK      |   Reserved   |
     7            12               13


Figure 4-6. PageMask Register 


The PageMask register is a read/write register used for reading or writing the TLB. It 
holds a comparison mask that sets the variable page size for each TLB entry, as shown in 
Table 4-6. 


Table 4-6. PageMask Register Field 


| Field | Bits | Description | Type | Initial Value |
| MASK | 24:13 | Page comparison mask. 0000 0000 0000: Page Size = 4 Kbytes; 0000 0000 0011: Page Size = 16 Kbytes; 0000 0000 1111: Page Size = 64 Kbytes; 0000 0011 1111: Page Size = 256 Kbytes; 0000 1111 1111: Page Size = 1 Mbyte; 0011 1111 1111: Page Size = 4 Mbytes; 1111 1111 1111: Page Size = 16 Mbytes | Read/Write | Undefined |
| | 31:25, 12:0 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |


TLB read and write operations use this register as either a source or a destination; when
virtual addresses are presented for translation into physical addresses, the corresponding
bits in the TLB identify which virtual address bits among bits 24:13 are used in the
comparison. When the MASK field is not one of the values shown in Table 4-6, the
operation of the TLB is undefined.
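
The seven encodings in Table 4-6 follow a simple pattern: the page size is (MASK + 1) × 4 Kbytes. The Python sketch below uses that observation; the names are hypothetical, and the rejection of other values mirrors the "undefined" wording above rather than any documented hardware check.

```python
VALID_MASKS = {0x000, 0x003, 0x00F, 0x03F, 0x0FF, 0x3FF, 0xFFF}


def page_size_bytes(pagemask):
    """Return the page size selected by a PageMask register value."""
    mask = (pagemask >> 13) & 0xFFF  # MASK field, bits 24:13
    if mask not in VALID_MASKS:
        raise ValueError("MASK value not listed in Table 4-6")
    return (mask + 1) * 4096
```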
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4.2.6 Wired Register (6) 


Figure 4-7. Wired Register 


The Wired register is a read/write register that specifies the boundary between the wired
and random entries of the TLB as shown in Figure 4-8. Wired entries are fixed,
non-replaceable entries which cannot be overwritten by a TLB write operation. Random
entries can be overwritten. Figure 4-7 shows the format of the Wired register. Table 4-7
describes the register fields.


The Wired register is set to 0 upon system reset. Writing this register also sets the
Random register to the value of its upper bound as shown in Figure 4-8.


         TLB
47 +----------------+
   | Random entries |
   +----------------+ <--- Wired register value
   | Wired entries  |
 0 +----------------+


Figure 4-8. Wired Register Boundary 


Writing a value greater than 47 into this register produces undefined results. 


Table 4-7. Wired Register Field Descriptions 
| Field | Bits | Description | Type | Initial Value |
| Wired | 5:0 | TLB Wired boundary (the number of wired TLB entries). | Read/Write | 0 |
| | 31:6 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |
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4.2.7 BadVAddr Register (8) 


31 0 


BadVAddr 


32 


Figure 4-9. BadVAddr Register 


The Bad Virtual Address register (BadVAddr) is a read-only register that displays the 
most recent virtual address that caused one of the following exceptions: TLB Invalid, TLB 
Modified, TLB Refill, or Address Error exceptions. 


Figure 4-9 shows the format of the BadVAddr register; Table 4-8 describes the register
fields.


Table 4-8. BadVAddr Register Field


| Field | Bits | Description | Type | Initial Value |
| BadVAddr | 31:0 | The most recent virtual address that caused a TLB Invalid, TLB Modified, TLB Refill, or Address Error exception. | Read-only | Undefined |


Note: The BadVAddr register does not save any information for bus errors, since bus
errors are not addressing errors.
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4.2.8 Count Register (9) 


Figure 4-10. Count Register


The Count register acts as a real-time timer. It is incremented every CPU clock cycle. The
timer interrupt signaled through IP[7] can be disabled through the interrupt mask bit,
IM[7]. This register can be read or written.


Figure 4-10 shows the format of the Count register. Table 4-9 describes the register fields.


Table 4-9. Count Register Field


| Field | Bits | Description | Type | Initial Value |
| Count | 31:0 | 32-bit timer, incrementing at the CPU clock rate. | Read/Write | Undefined |
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4.2.9 EntryHi Register (10) 


 31                 13 12       8 7        0
|         VPN2        | Reserved |   ASID   |
          19                5          8


Figure 4-11. EntryHi Register 


The EntryHi register holds the high-order bits of a TLB entry for TLB read and write
operations. The EntryHi register is accessed by the TLB Probe, TLB Write Random, TLB
Write Indexed, and TLB Read Indexed instructions.


When either a TLB Refill, TLB Invalid, or TLB Modified exception occurs, the EntryHi 
register is loaded with the virtual page number (VPN2) and the ASID of the virtual 
address that did not have a matching TLB entry. 


Figure 4-11 shows the format of the EntryHi register. Table 4-10 describes the register 
fields. 


Table 4-10. EntryHi Register Fields 


| Field | Bits | Description | Type | Initial Value |
| VPN2 | 31:13 | Virtual page number divided by two (maps to two pages). | Read/Write | Undefined |
| ASID | 7:0 | Address space ID field. An 8-bit field that lets multiple processes share the TLB; each process can have a distinct mapping of otherwise identical virtual page numbers. | Read/Write | Undefined |
| | 12:8 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |
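
A small Python sketch of how an EntryHi value relates to a virtual address and an ASID is shown below. The function name is hypothetical and the example address is arbitrary.

```python
def pack_entryhi(vaddr, asid):
    """Form an EntryHi value: VPN2 in bits 31:13, ASID in bits 7:0."""
    vpn2_part = vaddr & 0xFFFFE000  # keep virtual address bits 31:13 in place
    return vpn2_part | (asid & 0xFF)
```

For example, address 0x12345678 with ASID 0x5A gives 0x1234405A; bits 12:8 stay zero because they are reserved.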
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4.2.10 Compare Register (11) 


31 0 
Compare 
32 


Figure 4-12. Compare Register 


The Compare register acts as a timer (see also the Count register); it maintains a stable
value that does not change on its own. When the value of the Count register equals the
value of the Compare register, interrupt bit IP[7] in the Cause register is set. This causes
an interrupt as soon as the interrupt is enabled. Writing a value to the Compare register,
as a side effect, clears the timer interrupt.


For diagnostic purposes, the Compare register is a read/write register. In normal use, 
however, the Compare register is write-only. Figure 4-12 shows the format of the Compare 
register. Table 4-11 describes the register fields. 


Table 4-11. Compare Register Field 


| Field | Bits | Description | Type | Initial Value |
| Compare | 31:0 | The Compare register holds a stable value that is compared with the Count register. When the value of the Count register equals the value of the Compare register, interrupt IP[7] occurs. | Read/Write | Undefined |
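
A periodic tick is commonly built on this mechanism by rewriting Compare from the timer handler. The Python sketch below shows only the arithmetic; the function names are hypothetical, and the modulo-2^32 wrap is an assumption that follows from both registers being 32 bits wide.

```python
def next_compare(count, interval):
    """Compare value for the next tick, `interval` cycles after `count`."""
    return (count + interval) & 0xFFFFFFFF


def timer_match(count, compare):
    """True in the cycle where the Count value equals the Compare value."""
    return (count & 0xFFFFFFFF) == (compare & 0xFFFFFFFF)
```

Writing the new value to Compare also clears the pending timer interrupt, as described above, so a handler can acknowledge and rearm in one step.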
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4.2.11 Status Register (12) 
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Figure 4-13. Status Register 


The Status register (SR) is a read/write register that contains the operating mode,
interrupt enabling, and the diagnostic states of the processor. Figure 4-13 shows the
format of the Status register. Some of the more important Status register fields are
described below:


• The 3-bit Interrupt Mask (IM) field controls the enabling of three interrupt
signals. Interrupts must be enabled before they can be asserted. Interrupts are
recognized by the processor when the corresponding bits are set in both the
Interrupt Mask and the Interrupt Enable fields of the Status register and the
Interrupt Pending field of the Cause register. The C790 does not support
software interrupts. IM[7] corresponds to the internal timer interrupt and
IM[3:2] correspond to the Int[1:0] signals.

• The 4-bit Coprocessor Usability (CU) field (CU[3:0]) controls the usability of four
possible coprocessors. Regardless of the CU[0] bit setting, COP0 is always
usable in Kernel mode. For all other cases, an access to an unusable coprocessor
causes an exception. The C790 supports coprocessor 1 (FPU).




4.2.11.1 Status Register Format 


Table 4-12 describes the Status register fields. All bits in the Status register are readable
and writable.


Table 4-12. Status Register Fields


| Field | Bits | Description | Type | Initial Value |
| CU (CU[3:0]) | 31:28 | Controls the usability of each of the four coprocessor unit numbers. COP0 is always usable when in Kernel mode, regardless of the setting of the CU[0] bit. 1 → usable; 0 → unusable | Read/Write | Undefined |
| | | Enable additional floating point registers. 0 → 16 registers; 1 → 32 registers | Read/Write | |
| | | Controls the location of Performance counter and debug/SIO exception vectors. 0 → normal; 1 → bootstrap | Read/Write | Undefined |
| | | Controls the location of TLB refill and general exception vectors. 0 → normal; 1 → bootstrap | Read/Write | |
| | | Cache Hit (tag match and valid state) or Miss indication for the last CACHE Hit Invalidate and CACHE Hit Write-back Invalidate for the Data cache. 0 → miss; 1 → hit | Read/Write | Undefined |
| | | EI/DI instruction Enable: When this bit is set, the EI and DI instructions can operate in User, Supervisor and Kernel modes and as such set or clear the EIE bit to enable or disable all interrupts (except NMI). When this bit is cleared, EI and DI operate as NOPs in User and Supervisor modes and execute properly in Kernel mode. | Read/Write | Undefined |
| EIE | | Enable IE: This bit enables or disables the IE (Interrupt Enable) bit. This bit is cleared by the DI instruction and set by the EI instruction. 0 → disables all interrupts regardless of the value of the IE bit; 1 → enables the IE bit. (All interrupts are enabled if IE=1, EXL=0, and ERL=0.) Note: IM enables individual interrupts. | Read/Write | Undefined |
| IM | | Interrupt Mask: controls the enabling of each of the external and internal interrupts. An interrupt is taken if interrupts are enabled, and the corresponding bits are set in both the Interrupt Mask field of the Status register and the Interrupt Pending field of the Cause register. 0 → disabled; 1 → enabled. Note: The enabling of this bit is valid only when EIE=1, IE=1, EXL=0 and ERL=0. | Read/Write | Undefined |
| BEM | | Bus Error Mask: controls the updating of the BadPAddr register and signaling a bus error exception. 0 → update BadPAddr and signal a bus error exception; 1 → do not update BadPAddr and stop signaling a bus error exception. This bit is set to 1 when it is a 0 and a bus error is signaled. | Read/Write | Undefined |
| KSU | | Kernel/Supervisor/User Mode bits: 00₂ → Kernel; 01₂ → Supervisor; 10₂ → User; 11₂ → Reserved | Read/Write | Undefined |
| ERL | | Error Level: set by the processor when a Reset, NMI, performance counter, SIO or debug exception is taken. 0 → normal; 1 → error | Read/Write | 1 |
| EXL | | Exception Level: set by the processor when any exception other than Reset, NMI, performance counter, or debug exception is taken. 0 → normal; 1 → exception | Read/Write | Undefined |
| IE | | Interrupt Enable. 0 → disables all interrupts; 1 → enables all interrupts (if EIE=1, ERL=0, and EXL=0) | Read/Write | Undefined |
| | | Reserved. Must be written as zeroes, and returns zeroes when read. | | 0 |


4.2.11.2 Status Register Modes and Access States 
Fields of the Status register set the modes and access states below. 


Interrupt Enable: Interrupts are enabled when all of the following conditions are true:

Status.IE = 1,
and Status.EIE = 1,
and Status.EXL = 0,
and Status.ERL = 0.


If these conditions are met, setting the IM bits enables the corresponding interrupts.


SIO Enable: A level 2 exception by SIO is enabled when the following condition is true:

• Status.ERL = 0

If this condition is met, asserting the SIO signal causes a Debug exception to occur.


Operating Modes: The following CPU Status register bit settings are required for User,
Kernel, and Supervisor modes.


• The processor is in User mode when KSU = 10₂ and EXL = 0 and ERL = 0.
• The processor is in Supervisor mode when KSU = 01₂ and EXL = 0 and ERL = 0.
• The processor is in Kernel mode when KSU = 00₂ or EXL = 1 or ERL = 1.


Kernel Address Space Accesses: Access to the kernel address space is allowed when the 
processor is in Kernel mode. 


Supervisor Address Space Accesses: Access to the supervisor address space is allowed 
when the processor is in Kernel mode or Supervisor mode, as described above. 


User Address Space Accesses: Access to the user address space is allowed in Kernel, 
Supervisor, and User modes. 
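
The mode and interrupt-enable rules above can be summarized in a short Python sketch. The function names are hypothetical, and the field values are passed in directly rather than extracted from a raw Status word, since this sketch makes no assumption about the bit positions.

```python
def operating_mode(ksu, exl, erl):
    """Return the operating mode implied by the KSU, EXL and ERL fields."""
    if exl or erl or ksu == 0b00:
        return "Kernel"
    if ksu == 0b01:
        return "Supervisor"
    if ksu == 0b10:
        return "User"
    raise ValueError("KSU = 11 is reserved")


def interrupts_enabled(ie, eie, exl, erl):
    """True when IE = 1, EIE = 1, EXL = 0 and ERL = 0."""
    return bool(ie and eie and not exl and not erl)
```

Note that taking an exception sets EXL or ERL, which forces Kernel mode and masks interrupts regardless of KSU and IE.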
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4.2.12 Cause Register (13) 
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Figure 4-14. Cause Register 


The 32-bit read-only Cause register describes the cause of the most recent exception. 
Figure 4-14 shows the fields of this register. Table 4-13 describes the Cause register fields. 
All bits in the Cause register are read-only. 


Table 4-13. Cause Register Fields 


| Field | Bits | Description | Type | Initial Value |
| BD | 31 | Set by the processor when any exception other than Reset, NMI, performance counter, or debug occurs and is taken in a branch delay slot. 1 → delay slot; 0 → normal | Read-only | Undefined |
| BD2 | 30 | Indicates whether the last NMI, performance counter, debug, or SIO exception taken occurred in a branch delay slot. 1 → delay slot; 0 → normal | Read-only | Undefined |
| | 29:28 | Coprocessor unit number referenced when a Coprocessor Unusable exception is taken. | Read-only | Undefined |
| EXC2 | 18:16 | Indicates the exception codes for level 2 exceptions (Performance Counter, Reset, Debug, SIO and NMI exceptions). 000 (0): Res (Reset); 001 (1): NMI (Non-maskable Interrupt); 010 (2): PerfC (Performance Counter); 011 (3): Dbg (Debug) and SIO (SIO); 1xx (4-7): Reserved | Read-only | Undefined |
| IP[7,3:2] | 15, 11:10 | Indicates an interrupt is pending. 1 → interrupt pending; 0 → no interrupt | Read-only | Undefined, Int[1:0] |
| SIOP | 12 | Indicates an SIO signal is pending. 1 → SIO signal is pending; 0 → no SIO signal is pending | Read-only | |
| ExcCode | | Exception code field. 00000: Int (Interrupt); 00001: Mod (TLB modification exception); 00010: TLBL (TLB exception (load or instruction fetch)); 00011: TLBS (TLB exception (store)); 00100: AdEL (Address error exception (load or instruction fetch)); 00101: AdES (Address error exception (store)); 00110: IBE (Bus error exception (instruction fetch)); 00111: DBE (Bus error exception (data reference: load or store)); 01000: Sys (Syscall exception); 01001: Bp (Breakpoint exception); 01010: RI (Reserved instruction exception); 01011: CpU (Coprocessor Unusable exception); 01100: Ov (Arithmetic overflow exception); 01101: Tr (Trap exception); 01110: Reserved; 01111 (15): FPE (Floating-Point exception); (16-31): Reserved | Read-only | Undefined |
| | | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |
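
An exception handler usually dispatches on the ExcCode value. The following Python sketch maps a code to the mnemonic listed in Table 4-13; the names `EXC_CODES` and `exc_code_name` are hypothetical, and the code is assumed to have been extracted from the Cause register already.

```python
EXC_CODES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU",
    12: "Ov", 13: "Tr", 15: "FPE",
}


def exc_code_name(code):
    """Return the mnemonic for a 5-bit ExcCode value, or "Reserved"."""
    if not 0 <= code <= 31:
        raise ValueError("ExcCode is a 5-bit field")
    return EXC_CODES.get(code, "Reserved")
```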
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4.2.13 EPC Register (14) 
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Figure 4-15. EPC Register 


The Exception Program Counter (EPC) is a read/write register that contains the address 
at which processing resumes after an exception has been serviced. 


For synchronous exceptions, the EPC register contains either:

• the virtual address of the instruction that was the direct cause of the exception,
or
• the virtual address of the immediately preceding branch or jump instruction
(when the instruction is in a branch delay slot, and the BD bit in the Cause
register is set).


On the occurrence of an exception, if the EXL bit in the Status register is set to 1, the
processor does not update the EPC register. Figure 4-15 shows the format of the EPC
register. Table 4-14 describes the EPC register fields.


Table 4-14. EPC Register Field


| Field | Bits | Description | Type | Initial Value |
| EPC | 31:0 | Contains the address at which processing can resume after an exception has been serviced. | Read/Write | Undefined |
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4.2.14 PRId Register (15) 


 31              16 15               8 7              0
|     Reserved     |  Implementation  |    Revision    |
        16                  8                  8


Figure 4-16. PRId Register


The 32-bit read-only Processor Revision Identifier (PRId) register contains information
identifying the implementation and revision level of the C790 and COP0. Figure 4-16
shows the format of the PRId register; Table 4-15 describes the PRId register fields.


The low-order byte (bits 7:0) of the PRId register is interpreted as a revision number, and
the next byte (bits 15:8) is interpreted as an implementation number. The
implementation number of the C790 processor is 0x38. The contents of the high-order
halfword (bits 31:16) of the register are reserved.


The revision number is stored as a value in the form y.x, where y is a major revision number
in bits 7:4 and x is a minor revision number in bits 3:0.


The revision number can distinguish some chip revisions, but there is no guarantee that
changes to the chip will necessarily be reflected in the PRId register, or that changes to
the revision number necessarily reflect real chip changes. For this reason, these values are
not listed and software should not rely on the revision number in the PRId register to
characterize the chip.


Table 4-15. PRId Register Fields


| Field | Bits | Description | Type | Initial Value |
| | 15:8 | Implementation number | Read-only | 0x38 |
| Rev | 7:0 | Revision number of each mask | Read-only | Revision number |
| | 31:16 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |
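
The y.x interpretation described above can be sketched in Python as follows. The function name is hypothetical, and the revision byte used in the usage note is an arbitrary example, not a value documented for any real chip.

```python
def decode_prid(value):
    """Split a PRId value into implementation number and y.x revision."""
    implementation = (value >> 8) & 0xFF
    rev = value & 0xFF
    return {
        "implementation": implementation,
        "revision": f"{rev >> 4}.{rev & 0xF}",  # major in 7:4, minor in 3:0
    }
```

For a made-up value 0x3812, this returns implementation 0x38 and revision "1.2". As noted above, software should not rely on the revision number to characterize the chip.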
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4.2.15 Config Register (16) 
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Figure 4-17. Config Register Format 


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The Config register specifies various configuration options which can be selected. Figure 4- 
17 shows the format of the Config register; Table 4-16 describes the Config register fields. 


Some configuration options, as defined by Config bits 30:28, 15 and 11:6, are set by the
hardware during reset and are included in the Config register as read-only status bits for
the software to access. Other configuration options, such as bits 18:16 and 13:12, are set
by hardware during reset and can be modified by software. The remaining configuration
options, such as bits 2:0, are read/write and controlled by software; on reset these fields
are undefined.


Table 4-16. Config Register Fields 
| Field | Bits | Description | Type | Initial Value |
| | 30:28 | Bus clock ratio. 000: processor clock frequency divided by 2; 001 to 111: Reserved | Read-only | |
| DIE | | Double issue enable. 0 → Single issue; 1 → Double issue | Read/Write | |
| | | Setting this bit to 1 enables the instruction cache. 0 → Instruction cache disable; 1 → Instruction cache enable. The CACHE instruction for the instruction cache is enabled regardless of the value of this bit. | Read/Write | |
| | | Setting this bit to 1 enables the data cache. 0 → Data cache disable; 1 → Data cache enable. If the cache is disabled, the PREF instruction becomes a NOP. | Read/Write | |
| | | 0 → Little Endian; 1 → Big Endian | | |
| | | Setting this bit to 1 enables non-blocking load. 0 → Disable non-blocking loads and hit under miss; 1 → Enable non-blocking loads and hit under miss | Read/Write | |
| | | Setting this bit to 1 enables branch prediction. 0 → Disable Branch Prediction; 1 → Enable Branch Prediction | Read/Write | |
| IC | 11:9 | Instruction cache size (instruction cache size = 2^(IC+12) bytes). 011 → 32 KB | Read-only | |
| DC | 8:6 | Data cache size (data cache size = 2^(DC+12) bytes). 011 → 32 KB | Read-only | |
| | 2:0 | kseg0 coherency algorithm. 000: Reserved; 001: Reserved; 010: Uncached; 011: Cacheable, write-back, write allocate; 100: Reserved; 101: Reserved; 110: Reserved; 111: Uncached Accelerated | Read/Write | Undefined |
| | | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | |


With single issue enabled (DIE = 0), the C790 always fetches two instructions but only
issues a single instruction.
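
The cache size fields can be decoded with a short Python sketch. It assumes the size formula 2^(field + 12) bytes, which is consistent with the listed encoding 011 → 32 KB, and it assumes the data cache field occupies bits 8:6 (inferred from the read-only range 11:6 with the instruction cache field at 11:9). The function names are hypothetical.

```python
def cache_size_bytes(field):
    """Cache size for a 3-bit size field, assuming 2**(field + 12) bytes."""
    if not 0 <= field <= 7:
        raise ValueError("size field is 3 bits wide")
    return 1 << (field + 12)


def cache_sizes(config):
    """Return (instruction, data) cache sizes in bytes from a Config value."""
    ic = (config >> 9) & 0x7  # bits 11:9
    dc = (config >> 6) & 0x7  # bits 8:6 (assumed position)
    return cache_size_bytes(ic), cache_size_bytes(dc)
```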
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4.2.16 BadPAddr Register (23) 


 31                            4 3        0
|           BdPAddr             | Reserved |
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Figure 4-18. BadPAddr Register Format 


The Bad Physical Address register (BadPAddr) is a read-only register that contains the
most recent physical address that caused a bus error. It is updated with a new value
whenever Status.BEM is clear (0). Once this bit is set (on the occurrence of a bus error)
the register holds the value.


Figure 4-18 shows the BadPAddr register format; Table 4-17 describes the register fields.


Table 4-17. BadPAddr Register Fields 


| Field | Bits | Description | Type | Initial Value |
| BdPAddr | 31:4 | Physical Address value | Read-only | |
| | 3:0 | Reserved. Returns zeroes when read. | Read-only | |
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4.2.17 Debug Registers (24) 


There are seven separately addressable debug registers, which are all assigned to COP0,
register 24.


Each of the seven registers is accessed by specifying a subaccess code, which is bits 2 to 0
of the instruction code.


Breakpoint Control Register (BPC) (subaccess code 0)


[Figure: BPC register bit-field layout. The individual field positions are not recoverable here.]


See Table 13-3 for a detailed description of individual BPC register fields.
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Instruction Address Breakpoint (IAB) (Subaccess code 2) 


 31                                2 1    0
|               IAB                 |      |
                30                     2


Instruction Address Breakpoint Mask Register (IABM) (subaccess code 3)


 31                                2 1    0
|               IABM                |      |
                30                     2


Data Address Breakpoint Register (DAB) (subaccess code 4)


 31                                       0
|                   DAB                    |
                    32


Data Address Breakpoint Mask Register (DABM) (subaccess code 5)


 31                                       0
|                   DABM                   |
                    32


Data Value Breakpoint Register (DVB) (subaccess code 6)


 31                                       0
|                   DVB                    |
                    32


Data Value Breakpoint Mask Register (DVBM) (subaccess code 7)


 31                                       0
|                   DVBM                   |
                    32
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4.2.18 Performance Counter Registers (25) 


There are three separately addressable performance counter registers, which are all
assigned to COP0, register 25.

Each of the three registers is accessed by specifying a subaccess code, which is bits 1 to 0
of the instruction code.

All performance counter registers are read/write registers.


Performance Counter Control Register (PCCR)

 31   30      20 19    15 14 13 12 11 10 9      5 4  3  2  1  0
| CTE | Reserved | EVENT1 |  |  |  |  |  | EVENT0 |  |  |  |  |  |
   1      11         5     1  1  1  1  1     5     1  1  1  1  1


Performance Counter Register 0 (PCR0)

 31    30                            0
| OVFL |            VALUE             |
   1                  31


Performance Counter Register 1 (PCR1)

 31    30                            0
| OVFL |            VALUE             |
   1                  31


Figure 4-19. Performance Counter Registers
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Table 4-18 lists the field definitions for the Performance Counter Control register. 


Table 4-18. Performance Counter Control Register Fields


| Field | Bits | Description | Type | Initial Value |
| CTE | 31 | Enables event counting (CTR1, CTR0) and exception generation. 0 → Disable; 1 → Enable | Read/Write | |
| EVENT1 | 19:15 | Sets the event to be monitored by PCR1. 00000: Low-order branch issued; 00001: Processor cycle; 00010: Dual instruction issue; 00011: Branch mispredicted; 00100: TLB miss; 00101: DTLB miss; 00110: Data Cache miss; 00111: WBB single request unavailable; 01000: WBB burst request unavailable; 01001: WBB burst request almost full; 01010: WBB burst request full; 01011: CPU data bus busy; 01100: Instruction completed; 01101: Non-BDS instruction completed; 01110: COP1 instruction completed; 01111: Store completed; 10000: No event; (17-31): Reserved | Read/Write | Undefined |
| EVENT0 | 9:5 | Sets the event to be monitored by PCR0. 00000 (0): Reserved; 00001: Processor cycle; 00010: Single instruction issue; 00011: Branch issue; 00100: BTAC miss; 00101: ITLB miss; 00110: Instruction Cache miss; 00111: DTLB accessed; 01000: Non-blocking load; 01001: WBB single request; 01010: WBB burst request; 01011: CPU address bus busy; 01100: Instruction completed; 01101: Non-BDS instruction completed; 01110: Reserved; 01111: Load completed; 10000: No event; (17-31): Reserved | Read/Write | Undefined |
| | | Enables event counting (PCR1/PCR0) in the User mode. 0 → Disable; 1 → Enable | Read/Write | Undefined |
| | | Enables event counting (PCR1/PCR0) in the Supervisor mode. 0 → Disable; 1 → Enable | | |
| | | Enables event counting (PCR1/PCR0) in the Kernel mode. 0 → Disable; 1 → Enable | | |
| EXL1, EXL0 | | Enables event counting (PCR1/PCR0) when the EXL bit is set in the Status register. 0 → Disable; 1 → Enable | Read/Write | Undefined |
| | 30:20, 10, 0 | Reserved. Must be written as zero, and returns zero when read. | Read-only | |
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Table 4-19 lists the field definitions for the Performance Counter register 0(PCRO). 


Table 4-19. Performance Counter Register 0 Fields 


| Field | Bits | Description | Type | Initial Value |
| OVFL | 31 | Overflow flag | Read/Write | Undefined |
| VALUE | 30:0 | The actual counter | Read/Write | Undefined |


Table 4-20 lists the field definitions for the Performance Counter register 1 (PCR1).


Table 4-20. Performance Counter Register 1 Fields


| Field | Bits | Description | Type | Initial Value |
| OVFL | 31 | Overflow flag | Read/Write | Undefined |
| VALUE | 30:0 | The actual counter | Read/Write | Undefined |
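
Reading a counter therefore means separating the overflow flag from the 31-bit count. A minimal Python sketch, with a hypothetical function name and using the bit positions shown in Figure 4-19, is:

```python
def decode_pcr(value):
    """Split a PCR0 or PCR1 value into the OVFL flag and the 31-bit count."""
    return {
        "OVFL": (value >> 31) & 0x1,   # overflow flag, bit 31
        "VALUE": value & 0x7FFFFFFF,   # counter, bits 30:0
    }
```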
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4.2.19 TagLo (28) and TagHi (29) Registers 


TagLo

 31              12 11          7   6       5      4      3    2           0
|      PTagLo      | Special use | Dirty | Valid | LRF | Lock | Special use |


TagHi

 31                                       0
|               Special use                |
                    32


Figure 4-20. TagLo and TagHi Registers 


The TagLo and TagHi registers are 32-bit read/write registers used by the CACHE
instruction. For writing to the data cache tags, the TagLo register contains the fields as
shown above and the TagHi register is not used. For writing to the data cache data portion,
the TagLo register contains the data value. For writing to the instruction cache tags, the
TagLo register contains the fields as defined above except that bits three and six are also
reserved bits. For writing to the instruction cache data portion, the TagLo register
contains the data (instruction) and the TagHi register contains the steering bits and bits
for the BHT as defined in Chapter 7. When reading from the caches, the values in the
TagLo and TagHi registers are the same as described above for writing. These registers are
also used for manipulating the BTAC. See the description of the CACHE instruction in
Appendix C for details. Figure 4-20 shows the format of these registers for some of the
cache operations.
Table 4-21 lists the field definitions of the TagLo register.


Table 4-21. TagLo Register Fields


| Field | Bits | Description | Type | Initial Value |
| PTagLo | 31:12 | PTagLo[31:12] specifies the 20-bit physical address tag of the cache. | Read/Write | Undefined |
| | | Dirty: 0 = Clean, 1 = Dirty | Read/Write | Undefined |
| | | Valid: 0 = Invalid, 1 = Valid | Read/Write | Undefined |
| | | LRF Replacement: This bit participates in the calculation determining which cache way will be used for the next replacement. See Section 7.3.1 for details. | Read/Write | Undefined |
| | | Lock: This bit is only used for the data cache. For instruction cache operations this bit is treated as a reserved bit. 0 = For this line, this side is not locked. 1 = For this line, this side is locked. | Read/Write | Undefined |
| Special use | 11:7, 2:0 | Used by the CACHE instruction to manipulate the branch target address cache. Refer to Chapter 7 for details. | Read/Write | Undefined |


Table 4-22. TagHi Register Fields 


| Field | Bits | Description | Type | Initial Value |
| Special use | 31:0 | The TagHi register is used by the CACHE instruction to manipulate some of the bits of the instruction cache. Refer to Chapter 7 for details. | Read/Write | Undefined |
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4.2.20 ErrorEPC (30) 


[Figure: the ErrorEPC register, a single 32-bit field occupying bits 31:0.]


Figure 4-21. ErrorEPC Register


The ErrorEPC register is similar to the EPC register, except that ErrorEPC is used on 
nonmaskable interrupt (NMI), debug, SIO, and performance counter exceptions. 


The read/write ErrorEPC register contains the virtual address at which instruction 
processing can resume after servicing an error. This address can be: 


• the virtual address of the instruction that caused the exception
• the virtual address of the immediately preceding branch or jump instruction
  (when the instruction is in a branch delay slot, and the BD2 bit in the Cause
  register is set).


Table 4-23 lists the field definition of the ErrorEPC register.


Table 4-23. ErrorEPC Register Field


| Field | Bits | Description | Type | Initial Value |
| ErrorEPC | 31:0 | Contains the virtual address at which instruction processing can resume after servicing an error. | Read/Write | Undefined |




5. Exception Processing and Reset 


This chapter describes exception processing, including level 1 and level 2 exceptions.


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5.1 The Exception Handling Process 


Exceptions can be recognized while the processor is in any of its three operating modes:
User, Supervisor, or Kernel.


Exceptions are categorized into two groups, level 1 exceptions and level 2 exceptions,
as shown in Table 5-1.


Table 5-1. Exception Levels


| Level 1 | Level 2 |
| Interrupt | Reset |
| TLB Modified | NMI |
| TLB Refill | Performance Counter |
| TLB Invalid | Debug |
| Address Error | SIO |
| Syscall | |
| Break | |
| Trap | |
| Reserved Instruction | |
| Coprocessor Unusable | |
| Integer Overflow | |
| Bus Error | |
| Floating Point Exception | |


Compatibility Note: Level 2 exceptions are a generalization of the “error level” exception
processing defined in earlier MIPS implementations.


5.1.1. Level 1 Exceptions 
Exception Processing 


When the processor takes a level 1 exception, the processor switches to Kernel mode.
Rather than set the Status.KSU bits to effect the switch, the Status.EXL bit is set to 1.
Whenever Status.EXL is 1, the operating mode is Kernel mode, regardless of the setting of
Status.KSU.


Then the processor saves the virtual address of the instruction canceled by the exception.
This address is saved in the EPC register. If the canceled instruction is in the delay slot of
a branch instruction, the Cause.BD bit is set to 1 and EPC is set to the address of the
branch instruction (rather than the delay slot). For non-delay-slot instructions, Cause.BD
is set to 0. If the Status.EXL bit was 1 before the exception is taken, EPC and Cause.BD
are not set. The exception service routine examines Cause.BD to determine the true
address of the instruction that raised the exception.


In addition to setting EPC, Cause.BD, and Status.EXL, the 5-bit field Cause.ExcCode is
also set. This field specifies the cause of the exception. The Cause.CE field is also
set when a Coprocessor Unusable exception is raised.


After setting those bits, the processor jumps to the exception vector address. 
The basic exception handling operation is summarized in Figure 5-1, Level 1 Exception
Processing Flowchart.


Disabled exceptions in level 1 exception handler


Once a level 1 exception service routine is entered, interrupts and bus errors are
unconditionally disabled.


C790 Programming Note: The only level 1 exceptions that are unconditionally
disabled within a level 1 exception handler are external interrupts and bus errors.
All other level 1 exceptions still occur and are recognized (if enabled). A software
system that makes use of such exceptions must use extreme care. In particular,
it must make sure that it has saved EPC and Cause.BD somewhere (e.g. in a
software-managed stack) before the exception occurs.
Set Cause.ExcCode
Cause.CE ← coprocessor number when CpU exception
Set BadVAddr when AdES, AdEL or any TLB exception
Set Context and EntryHi when any TLB exception
Set BadPAddr when Bus Error

Status.EXL = 1 ?
    YES: Offset ← 0x180
    NO:  Instr. in Br. Dly. Slot ?
             YES: EPC ← PC-4, Cause.BD ← 1
             NO:  EPC ← PC,   Cause.BD ← 0
         Status.EXL ← 1
         Exception type:
             TLB Refill: Offset ← 0x0
             Other:      Offset ← 0x180
             Interrupt:  Offset ← 0x200

Status.BEV:
    = 0 (normal):    PC ← 0x8000 0000 + Offset
    = 1 (bootstrap): PC ← 0xBFC0 0200 + Offset


Figure 5-1. Level 1 Exception processing flowchart
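
The flow in Figure 5-1 can be sketched as a small software model. The Python below is an
illustrative simulation only, not a description of the hardware implementation. The names
CpuState and take_level1_exception, and the string labels used for the exception kind, are
hypothetical and introduced just for this example; the offsets and base addresses are the
ones given in this section.

```python
from dataclasses import dataclass

# Vector bases as given in Figure 5-1 and Table 5-2.
BASE_NORMAL = 0x8000_0000     # Status.BEV = 0
BASE_BOOTSTRAP = 0xBFC0_0200  # Status.BEV = 1


@dataclass
class CpuState:
    """Hypothetical container for the fields touched on level 1 entry."""
    pc: int = 0
    epc: int = 0
    cause_bd: int = 0
    cause_exccode: int = 0
    status_exl: int = 0
    status_bev: int = 0


def take_level1_exception(cpu, exc_code, kind="other", in_delay_slot=False):
    """Model level 1 entry; kind is "tlb_refill", "interrupt" or "other"."""
    cpu.cause_exccode = exc_code
    if cpu.status_exl == 0:
        # EPC and Cause.BD are updated only when EXL was 0 beforehand.
        if in_delay_slot:
            cpu.epc = cpu.pc - 4  # address of the preceding branch
            cpu.cause_bd = 1
        else:
            cpu.epc = cpu.pc
            cpu.cause_bd = 0
        offset = {"tlb_refill": 0x000, "interrupt": 0x200}.get(kind, 0x180)
    else:
        # Nested level 1 exception: common vector, EPC and BD unchanged.
        offset = 0x180
    cpu.status_exl = 1
    base = BASE_BOOTSTRAP if cpu.status_bev else BASE_NORMAL
    cpu.pc = base + offset
    return cpu.pc
```

In this model a TLB refill taken while Status.EXL is already 1 lands on the common vector
(offset 0x180) and leaves EPC untouched, which is why a handler that expects nested
exceptions has to save EPC and Cause.BD first.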
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5.1.2 Level 2 Exceptions 
Exception Processing 


When the processor takes a level 2 exception, the processor switches to Kernel mode by
setting Status.ERL to 1.


The address of the instruction where the level 2 exception was recognized is stored in the
ErrorEPC register. If the canceled instruction is in the delay slot of a branch instruction,
the Cause.BD2 bit is set to 1 and ErrorEPC is set to the address of the branch instruction
(rather than the delay slot). For non-delay-slot instructions, Cause.BD2 is set to 0. In
addition, the cause of the exception is stored in Cause.EXC2.


After setting those bits, the processor jumps to the exception vector address.


The basic level 2 exception handling operation is summarized in Figure 5-2, Level 2
Exception Processing Flowchart.


Disabled Exceptions in level 2 exceptions 


When executing a level 2 exception service routine, the following exceptions are disabled:


• NMI, Interrupt, and Bus error
• Debug, SIO and Performance counter


C790 Implementation Note: Any external exception that is not level-sensitive (e.g.
NMI) must be held until it is recognized; i.e. at least until the level 2 handler is
exited.


C790 Programming Note: It is the programmer’s responsibility to ensure that all
other internal exceptions (e.g. OVERFLOW) never occur within a level 2 handler.
If they do occur, the corresponding level 1 exception handler will be entered.
Since both Status.EXL and Status.ERL will be set when servicing this (nested)
exception, the ERET used to exit the service routine will operate incorrectly.


C790 Programming Note: When Status.ERL = 1, the user address region, Kuseg,
becomes a 2^31-byte unmapped, uncached address space (that is, mapped directly
to physical addresses 0x0000 0000 to 0x7FFF FFFF).
Set Cause.EXC2

Instr. in Br. Dly. Slot ?
    YES: ErrorEPC ← PC-4, Cause.BD2 ← 1
    NO:  ErrorEPC ← PC,   Cause.BD2 ← 0

Status.ERL ← 1

Exception type:
    = Reset or NMI:
        Status.BEV ← 1
        On Reset (see Section 5.5.2):
            Status.BEM ← 0
            Config.DIE/ICE/DCE ← 0
            Config.NBE/BPE ← 0
            Random ← 47
            Wired ← 0
            PCCR.CTE ← 0
            BPC.IAE/DRE/DWE ← 0
        PC ← 0xBFC0 0000
    = Performance Counter: Offset ← 0x80
    = Debug or SIO:        Offset ← 0x100

Status.DEV (Performance Counter, Debug and SIO only):
    = 0 (normal):    PC ← 0x8000 0000 + Offset
    = 1 (bootstrap): PC ← 0xBFC0 0200 + Offset


Figure 5-2. Level 2 Exception processing flowchart
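
As with level 1, the level 2 flow can be sketched as a host-side model. This is only an
illustration under stated assumptions: Level2State, take_level2_exception and the string
labels for the exception kind are hypothetical names, and the model simply follows the order
of the flowchart (on a real Reset, register contents other than those listed in Section
5.5.2 are undefined).

```python
from dataclasses import dataclass

# Cause.EXC2 codes and vector offsets as given in this chapter.
EXC2 = {"reset": 0, "nmi": 1, "perf_counter": 2, "debug": 3, "sio": 3}
OFFSET = {"perf_counter": 0x080, "debug": 0x100, "sio": 0x100}


@dataclass
class Level2State:
    """Hypothetical container for the fields touched on level 2 entry."""
    pc: int = 0
    error_epc: int = 0
    cause_bd2: int = 0
    cause_exc2: int = 0
    status_erl: int = 0
    status_bev: int = 0
    status_dev: int = 0


def take_level2_exception(cpu, kind, in_delay_slot=False):
    """Model level 2 entry for one of the kinds listed in EXC2."""
    cpu.cause_exc2 = EXC2[kind]
    if in_delay_slot:
        cpu.error_epc = cpu.pc - 4  # address of the preceding branch
        cpu.cause_bd2 = 1
    else:
        cpu.error_epc = cpu.pc
        cpu.cause_bd2 = 0
    cpu.status_erl = 1
    if kind in ("reset", "nmi"):
        # Reset and NMI share one fixed vector and force BEV to 1.
        cpu.status_bev = 1
        cpu.pc = 0xBFC0_0000
    else:
        base = 0xBFC0_0200 if cpu.status_dev else 0x8000_0000
        cpu.pc = base + OFFSET[kind]
    return cpu.pc
```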
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5.2 Exception Vector Locations 


Exception vector addresses for level 1 exceptions are shown in Table 5-2.
The vector address for TLB refill depends on the Status.EXL bit. The vector addresses for
level 1 exceptions also depend on the Status.BEV bit.


Table 5-2. Exception Vectors for Level 1 exceptions


| Exception | Vector Address (BEV = 0) | Vector Address (BEV = 1) |
| TLB Refill (EXL = 0) | 0x8000 0000 | 0xBFC0 0200 |
| TLB Refill (EXL = 1) | 0x8000 0180 | 0xBFC0 0380 |
| Interrupt | 0x8000 0200 | 0xBFC0 0400 |
| Others | 0x8000 0180 | 0xBFC0 0380 |


Exception vector addresses for level 2 exceptions are shown in Table 5-3.
The vector addresses for level 2 exceptions also depend on the Status.DEV bit.


Table 5-3. Exception Vectors for Level 2 exceptions


| Exception | Vector Address (DEV = 0) | Vector Address (DEV = 1) |
| Reset, NMI | 0xBFC0 0000 | 0xBFC0 0000 |
| Performance Counter | 0x8000 0080 | 0xBFC0 0280 |
| Debug, SIO | 0x8000 0100 | 0xBFC0 0300 |
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5.3 Cause Register Setting 


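The Cause.ExcCode bits are set when a level 1 exception is taken.
The Cause.ExcCode setting is shown in Table 5-4.


Table 5-4. Cause.ExcCode Field


| ExcCode | Exception |
| 0 | Int (Interrupt) |
| 1 | Mod (TLB modified exception) |
| 2 | TLBL (TLB exception; load or instruction fetch) |
| 3 | TLBS (TLB exception; store) |
| 4 | AdEL (Address error exception; load or instruction fetch) |
| 5 | AdES (Address error exception; store) |
| 6 | IBE (Bus error exception; instruction fetch) |
| 7 | DBE (Bus error exception; load or store) |
| 8 | Sys (Syscall exception) |
| 9 | Bp (Breakpoint exception) |
| 10 | RI (Reserved instruction exception) |
| 11 | CpU (Coprocessor unusable exception) |


The Cause.EXC2 bits are set when a level 2 exception is taken.
The Cause.EXC2 setting is shown in Table 5-5.


Table 5-5. Cause.EXC2 Field


| EXC2 | Exception |
| 0 | Res (Reset exception) |
| 1 | NMI (Non-Maskable Interrupt) |
| 2 | PerfC (Performance counter exception) |
| 3 | Dbg (Debug exception), SIO (SIO exception) |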
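
A handler typically begins by extracting these codes from the value read from the Cause
register. The sketch below shows one way to do that on a host. Note the assumption: this
section does not give the field positions, so the shifts and masks used here (ExcCode in
bits 6:2, EXC2 in bits 18:16) are illustrative and should be checked against the Cause
register description in Chapter 4. The function name decode_cause is hypothetical, and the
name tables contain only the codes stated in this chapter.

```python
# ASSUMPTION: field positions are not stated in this section; verify
# them against the Cause register description before relying on them.
EXCCODE_SHIFT, EXCCODE_MASK = 2, 0x1F  # 5-bit ExcCode field
EXC2_SHIFT, EXC2_MASK = 16, 0x7        # assumed 3-bit EXC2 field

EXCCODE_NAMES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU",
}
EXC2_NAMES = {0: "Res", 1: "NMI", 2: "PerfC", 3: "Dbg/SIO"}


def decode_cause(cause):
    """Return (level 1 name, level 2 name) for a 32-bit Cause value."""
    exc = (cause >> EXCCODE_SHIFT) & EXCCODE_MASK
    exc2 = (cause >> EXC2_SHIFT) & EXC2_MASK
    # Codes not listed in this chapter are reported numerically.
    return (EXCCODE_NAMES.get(exc, f"code {exc}"),
            EXC2_NAMES.get(exc2, f"code {exc2}"))
```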
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5.4 Masking an exception 
The following exceptions can be masked by setting bits in the Status register:
NMI, Performance counter, Debug, Bus error, Interrupt and SIO.


Table 5-6 shows which bits mask those exceptions. Exceptions marked
with “X” can be masked by setting (BEM, EXL or ERL) or clearing (IE or IM) the
corresponding bit in the Status register.


Table 5-6. Masking exceptions


| Exception | IE | IM | BEM | EXL | ERL |
| Reset | | | | | |
| NMI | | | | | X |
| Performance Counter | | | | | X |
| Debug | | | | | X |
| SIO | | | | | X |
| Address error | | | | | |
| TLB Refill/Invalid/Modify | | | | | |
| Bus error | | | X | X | X |
| Syscall | | | | | |
| Break | | | | | |
| Reserved instruction | | | | | |
| Coprocessor Unusable | | | | | |
| Interrupt | X | X | | X | X |
| Integer overflow | | | | | |
| Trap | | | | | |
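
The masking rules can be captured in a small lookup. The sketch below is a host-side
illustration of the rules stated in this section and in the per-exception descriptions that
follow; the function name is_masked, the lower-case exception labels, and the convention of
passing Status bits as a dictionary are all hypothetical choices made for the example.

```python
# Bits that mask an exception when SET (BEM, EXL, ERL).
MASKED_WHEN_SET = {
    "nmi": {"ERL"},
    "perf_counter": {"ERL"},
    "debug": {"ERL"},
    "sio": {"ERL"},
    "bus_error": {"BEM", "EXL", "ERL"},
    "interrupt": {"EXL", "ERL"},
}
# Bits that mask an exception when CLEARED (IE, IM).
MASKED_WHEN_CLEAR = {"interrupt": {"IE", "IM"}}


def is_masked(exception, status):
    """status maps Status bit names to 0/1.

    For "IM", pass the mask bit of the interrupt in question. Names
    missing from status are treated as not masking.
    """
    set_hit = any(status.get(bit, 0)
                  for bit in MASKED_WHEN_SET.get(exception, ()))
    clear_hit = any(not status.get(bit, 1)
                    for bit in MASKED_WHEN_CLEAR.get(exception, ()))
    return set_hit or clear_hit
```

Exceptions that do not appear in either table, such as Syscall or Address error, are never
reported as masked, which matches their "not maskable" descriptions.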
5.5 Detailed Description
5.5.1. Exception Priority 


Exception priority rules determine which exception is taken first if multiple exceptions
occur on the same instruction. Table 5-7 shows the priority order of the exceptions.


Table 5-7. Exception Priority Order


Performance Counter
Instruction Breakpoint (debug)
Address error - Instruction fetch
TLB refill - Instruction fetch
TLB invalid - Instruction fetch
Bus Error - Instruction fetch
Single Step
SYSCALL, BREAK, Reserved Instruction,* Floating Point Exception or Coprocessor Unusable*
Interrupt
Data address/value breakpoint (debug)
SIO
Integer overflow, Trap
Address error - data access
TLB refill - data access
TLB invalid - data access
TLB modified - data access
Bus error - data access (lowest priority)


* The exception priority between the Reserved Instruction exception (RI) and the
Coprocessor Unusable exception (CpU):


The exception priorities of the two exceptions are the same. However, when
Status.CU[1] = 0, an attempt to execute any FPU (COP1) instruction causes a CpU
exception. When Status.CU[1] = 1, the attempt is reported as an FPE(E): unimplemented
FPU exception in the COP1 sub-instructions.

On the other hand, an attempt to execute any COP0 class reserved instruction causes
an RI exception regardless of Status.CU[0].
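
The priority order can be modelled as an ordered list from which the first pending entry
wins. The sketch below is illustrative only; the identifier names are hypothetical labels
for the rows of Table 5-7, with the SYSCALL, BREAK, Reserved Instruction, Floating Point
Exception and Coprocessor Unusable row collapsed into one entry because those exceptions
share a priority level.

```python
# Rows of Table 5-7, highest priority first.
PRIORITY = [
    "performance_counter",
    "instruction_breakpoint",
    "address_error_fetch",
    "tlb_refill_fetch",
    "tlb_invalid_fetch",
    "bus_error_fetch",
    "single_step",
    "instruction_class",   # SYSCALL, BREAK, RI, FPE or CpU
    "interrupt",
    "data_breakpoint",
    "sio",
    "overflow_or_trap",
    "address_error_data",
    "tlb_refill_data",
    "tlb_invalid_data",
    "tlb_modified_data",
    "bus_error_data",
]
RANK = {name: index for index, name in enumerate(PRIORITY)}


def highest_priority(pending):
    """Return the pending exception that Table 5-7 ranks first."""
    return min(pending, key=RANK.__getitem__)
```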
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5.5.2 Reset Exception 
Cause 


The RESET exception occurs when the Reset* signal is asserted and then deasserted. This
exception is not maskable.


Exception Level: 2
Vector Address: 0xBFC0 0000
Processing


The RESET exception vector is located within uncached and unmapped address space.
Hence the cache and TLB need not be initialized in order to process the exception.


The contents of all registers in the CPU are undefined when this exception is recognized,
except for the following register fields:
• In the Status register,
  Status.ERL and Status.BEV are set to 1.
  Status.BEM is set to 0.
  All other bits except for 0-fixed bits are undefined.
• In the Cause register,
  Cause.EXC2 is set to 0 (to indicate that a Reset occurred).
  All other bits except for 0-fixed bits are undefined.
• In the Config register,
  the DIE, ICE, DCE, NBE, and BPE bits are set to 0.
  All other bits except for fixed-value, read-only bits are undefined.
• The Random register is initialized to the value of its upper bound (47).
• The Wired register is initialized to 0.
• The Counter Enable flag in the Performance Counter Control register
  (PCCR.CTE) is set to 0.
• The breakpoint address enable flags in the Breakpoint Control register,
  BPC.IAE, BPC.DRE, and BPC.DWE, are all set to 0.
• The Valid, Dirty, LRF, and Lock bits of the data cache and the Valid and LRF bits of
  the instruction cache are initialized to 0 on reset.


Servicing 


The RESET exception is serviced by: 


e initializing all processor registers, coprocessor registers, caches, and the memory 
system 

e performing diagnostic tests 

e bootstrapping the operating system 
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5.5.3 Non-Maskable Interrupt (NMI) Exception 
Cause 


The Non-Maskable Interrupt (NMI) exception occurs in response to the falling edge of the
NMI* signal. The NMI exception is maskable by setting the Status.ERL bit. It is
recognized regardless of the settings of the Status.EXL and Status.IE bits.


Exception Level: 2
Vector Address: 0xBFC0 0000
Processing


NMI and RESET exceptions share the same exception vector. This vector is located within
uncached and unmapped address space; therefore, the cache and TLB need not be
initialized in order to process the exception.


When the NMI exception is recognized, all register contents are preserved with the
following exceptions:


• The ErrorEPC register, which contains the restart PC, and Cause.BD2, which records
  whether the NMI was recognized in a branch delay slot.

• The Status.ERL and Status.BEV flags are both set to 1.

• Cause.EXC2 is set to 1 (NMI).


Servicing


Note that the NMI service routine entry address does not depend on the Status.BEV flag.
In fact, the Status.BEV bit is unconditionally set to 1 before the NMI handler is entered.
It is up to the NMI service routine to restore the setting of the Status.BEV bit prior to exit.
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5.5.4 Performance Counter Exception 
Cause 


A Performance Counter exception occurs when a performance counter overflows
and the conditions described in Section 9.3.2 are met. This exception is maskable by
setting the Status.ERL bit.


Exception Level: 2
Vector Address: 0x8000 0080 (DEV = 0), 0xBFC0 0280 (DEV = 1)
Processing


The value of Cause.EXC2 is set to 2 (PerfC). The ErrorEPC register contains the address
of the instruction where the Performance Counter exception was detected unless it is in a
branch delay slot, in which case the ErrorEPC register contains the address of the
preceding branch instruction and Cause.BD2 is set.


Servicing 


When this exception is recognized, control is transferred to the applicable service routine. 
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5.5.5 Debug Exception 
Cause 


A DEBUG exception occurs whenever hardware breakpoint conditions as described in
Chapter 13 are detected. This exception is maskable by setting the Status.ERL bit.


Exception Level: 2
Vector Address: 0x8000 0100 (DEV = 0), 0xBFC0 0300 (DEV = 1)
Processing


The value of Cause.EXC2 is set to 3 (Dbg). The ErrorEPC register contains the address of
the instruction where the debug exception was detected unless it is in a branch delay slot,
in which case the ErrorEPC register contains the address of the preceding branch
instruction and Cause.BD2 is set. Note that the Load data value breakpoint exception is
imprecise. That is, the instruction where the breakpoint is detected is not the load
instruction that triggers the breakpoint; see Chapter 13 for more details.


Servicing 


When this exception is recognized, control is transferred to the applicable service routine. 
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5.5.6 Address Error Exception 
Cause 


The Address Error exception occurs when an attempt is made to do one of the
following:


• load or store a doubleword that is not aligned on a doubleword boundary
• load, fetch, or store a word that is not aligned on a word boundary
• load or store a halfword that is not aligned on a halfword boundary
• reference the kernel address space from User or Supervisor mode
• reference the supervisor address space from User mode


This exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The value of Cause.ExcCode is set to 4 (AdEL) or 5 (AdES), depending on whether the
exception was caused by an instruction reference (AdEL), load operation (AdEL), or
store operation (AdES).


When this exception is recognized, the virtual address that was not properly aligned or
that referenced protected address space is stored in the BadVAddr register. This update
occurs even if the exception occurs within a level 1 or level 2 exception handler. The
contents of the VPN field of the Context and EntryHi registers are undefined, as are the
contents of the EntryLo register.


The EPC register contains the address of the instruction that caused the exception, unless
this instruction is in a branch delay slot. If it is in a branch delay slot, the EPC register
contains the address of the preceding branch instruction and Cause.BD is set to indicate
that the branch delay slot instruction actually caused the exception.
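
The alignment conditions and the resulting ExcCode can be illustrated with a short check.
This is a sketch only: the function name address_error_code is hypothetical, and it covers
just the three alignment cases listed above, not the mode-based protection checks (which
depend on the address map described elsewhere in this manual).

```python
ADEL, ADES = 4, 5  # Cause.ExcCode values given in this section


def address_error_code(vaddr, size, is_store):
    """Return ADEL, ADES, or None for an access of `size` bytes.

    size is 2 (halfword), 4 (word or instruction fetch) or
    8 (doubleword). Instruction fetches are treated as loads.
    """
    if size not in (2, 4, 8):
        raise ValueError("size must be 2, 4 or 8 bytes")
    if vaddr % size == 0:
        return None  # naturally aligned: no alignment error
    return ADES if is_store else ADEL
```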
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5.5.7 TLB Refill Exception 
Cause 


The TLB refill exception occurs when there is no TLB entry to match a reference to a 
mapped address space. This exception is not maskable. 


Exception Level: 1 


Vector Address: EXL = 0: 0x8000 0000 (BEV = 0), 0xBFC0 0200 (BEV = 1)
                EXL = 1: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)


Processing


The value of Cause.ExcCode is set to either 2 (TLBL) or 3 (TLBS). This code indicates
whether the exception was caused by an instruction reference or load operation (TLBL)
or by a store operation (TLBS).


When this exception is recognized, the BadVAddr, Context and EntryHi registers are
updated to hold the virtual address that failed address translation. The EntryHi register
also contains the ASID for which the translation fault occurred. These actions take place
even if the exception is recognized within a level 1 or level 2 exception handler. The
Random register normally contains a valid location in which to place the replacement TLB
entry. The contents of the EntryLo register are undefined. The EPC register contains the
address of the instruction that caused the exception, unless this instruction is in a branch
delay slot, in which case the EPC register contains the address of the preceding branch
instruction and Cause.BD is set.


Together, the EPC register and the BD bit in the Cause register identify the address of
the instruction causing the exception.


Servicing


To service this exception, the contents of the Context register are used as a virtual address
to fetch memory locations containing the physical page frame and access control bits for a
pair of TLB entries. The two entries are placed into the EntryLo0/EntryLo1 registers; the
EntryHi and EntryLo registers are then written into the TLB.


It is possible that the virtual address used to obtain the physical address and access 
control information is on a page that is not resident in the TLB. This condition is 
processed by allowing a TLB refill exception in the TLB refill handler. This second 
exception goes to the common exception vector because the EXL bit of the Status register 
is set. 
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5.5.8 TLB Invalid Exception 
Cause 


The TLB invalid exception occurs when a virtual address reference matches a TLB entry 
that is marked invalid (TLB valid bit cleared). This exception is not maskable. 


Exception Level: 1 
Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The value of Cause.ExcCode is set to either 2 (TLBL) or 3 (TLBS). This code indicates
whether the exception was caused by an instruction reference or load operation (TLBL)
or by a store operation (TLBS).


When this exception is recognized, the BadVAddr, Context, and EntryHi registers are
loaded with the virtual address that failed address translation. The EntryHi register also
contains the ASID for which the translation fault occurred. These actions occur even if the
exception is recognized within a level 1 or level 2 exception handler. The Random register
normally contains a valid location in which to put the replacement TLB entry. The
contents of the EntryLo register are undefined.


The EPC register contains the address of the instruction that caused the exception unless
this instruction is in a branch delay slot, in which case the EPC register contains the
address of the preceding branch instruction and the BD bit of the Cause register is set.


Servicing


A TLB entry is typically marked invalid when one of the following is true:


• a virtual address does not exist

• the virtual address exists, but is not in main memory (a page fault)

• a trap is desired on any reference to the page (for example, to maintain a
  reference bit)


After servicing the cause of a TLB Invalid exception, the TLB entry is located with TLBP
(TLB Probe), and replaced by an entry with that entry’s Valid bit set.
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5.5.9 TLB Modified Exception 
Cause 


The TLB modified exception occurs when a store operation generates a virtual address 
that matches a TLB entry that is marked valid but is not dirty and therefore is not 
writable. This exception is not maskable. 


Exception Level: 1 
Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The value of Cause.ExcCode is set to 1 (Mod) and the BadVAddr, Context, and EntryHi
registers contain the virtual address that failed address translation. The EntryHi register
also contains the ASID for which the translation fault occurred. These actions occur even
if the exception is recognized within a level 1 or level 2 exception handler. The contents of
the EntryLo register are undefined.


The EPC register contains the address of the instruction that caused the exception unless
that instruction is in a branch delay slot, in which case the EPC register contains the
address of the preceding branch instruction and the BD bit of the Cause register is set.


Servicing


The kernel uses the failed virtual address or virtual page number to identify the
corresponding access control information. The page identified may or may not permit
write accesses; if writes are not permitted, a write protection violation occurs.


If write accesses are permitted, the page frame is marked dirty/writable by the kernel in
its own data structures. The TLBP instruction places the index of the TLB entry that
must be altered into the Index register. The EntryLo register is loaded with a word
containing the physical page frame and access control bits (with the D bit set), and the
EntryHi and EntryLo registers are written into the TLB.
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5.5.10 Bus Error Exception 
Cause 


A Bus Error exception is raised when the BUSERR* signal is asserted during bus transactions.
This exception is masked when Status.BEM, Status.EXL or Status.ERL is set to 1.


Exception Level: 1
Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The value of Cause.ExcCode is set to 6 (IBE) or 7 (DBE), indicating whether the exception
was caused by an instruction reference (IBE), load operation (DBE), or store operation
(DBE). The BadPAddr register is set to the physical address which caused the bus error
when the Status.BEM bit is 0.


The EPC register and BD bit in the Cause register point to the address of the instruction
currently being executed by the processor.


Note that there is no necessary relationship between a bus error and the instruction being 
executed currently. For example, a bus error may be caused by instruction prefetch, or by 
a data cache line operation that is unrelated to any instruction. Furthermore, it could be 
caused by a load or store that was issued several instructions prior to the instruction that 
was executing when the bus error was recognized. 


If a bus error is caused by a load or store instruction, the instruction is retired. If the 
instruction is a store, the nature of how memory is updated depends on the memory 
subsystem’s design. If the instruction is a load, the value loaded into the destination 
register is indeterminate. If a data value breakpoint is pending for the memory address 
accessed, breakpoint recognition is implementation dependent. 


Servicing 


In the C790 the bus error exception is imprecise and as such difficult to recover from and 
continue processing. If a bus error occurs during instruction or data cache refills, the 
cache line loaded has undefined values in it. Since it is not possible in general to 
determine the offending address (from the EPC) the entire data and instruction cache 
contents should be invalidated by using Index Invalidate suboperation of the CACHE 
instruction. (See the CACHE instruction’s definition for details on how to do this.) 
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5.5.11 System Call Exception 
Cause 


A SYSCALL exception occurs as a result of executing the SYSCALL instruction. This 
exception is not maskable. 


Exception Level: 1 
Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The value of Cause.ExcCode is set to 8 (Sys). The EPC register contains the address of the
SYSCALL instruction unless it is in a branch delay slot, in which case the EPC register
contains the address of the preceding branch instruction and Cause.BD is set.


Servicing 
When this exception is recognized, control is transferred to the applicable system routine. 


To resume execution, the EPC register must be altered so that the SYSCALL instruction 
does not re-execute; this is accomplished by adding a value of 4 to the EPC register (EPC 
register + 4) before returning. 


If a SYSCALL instruction is in a branch delay slot, a more complicated algorithm, beyond 
the scope of this description, may be required. 
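
The resume rule can be written down as a small helper. This is a sketch under stated
assumptions: syscall_resume_epc is a hypothetical name, and the delay-slot case is
deliberately left unimplemented because, as noted above, it requires interpreting the
branch instruction.

```python
def syscall_resume_epc(epc, cause_bd):
    """Return the EPC value to restore before returning from SYSCALL.

    Only the simple case (Cause.BD = 0) is handled: the resume address
    is EPC + 4, wrapped to 32 bits.
    """
    if cause_bd:
        # EPC points at the branch; adding 4 would re-run the SYSCALL.
        raise NotImplementedError(
            "SYSCALL in a branch delay slot: the branch must be interpreted")
    return (epc + 4) & 0xFFFF_FFFF
```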
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5.5.12 BREAK Instruction Exception 
Cause 


A BREAK exception occurs as a result of executing the BREAK instruction. This exception 
is not maskable. 


Exception Level: 1 
Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing 


The value of Cause.ExcCode is set to 9 (Bp). The EPC register contains the address of the 
BREAK instruction unless it is in a branch delay slot, in which case the EPC register 
contains the address of the preceding branch instruction and Cause.BD is set. 


Servicing 


When a BREAK exception is recognized, control is transferred to the applicable system 
routine. Additional distinctions can be made by analyzing the unused bits of the BREAK 
instruction (bits 25:6), and loading the contents of the instruction whose address the EPC 
register contains. A value of 4 must be added to the contents of the EPC register (EPC 
register +4) to locate the instruction if it resides in a branch delay slot. 


To resume execution, the EPC register must be altered so that the BREAK instruction 
does not re-execute; this is accomplished by adding a value of 4 to the EPC register (EPC 
register + 4) before returning. 


If a BREAK instruction is in a branch delay slot, interpretation of the branch instruction 
is required to resume execution. 
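
Locating the BREAK instruction and extracting its unused bits can be sketched as follows.
The function names are hypothetical, and the code only models the arithmetic described
above (EPC or EPC + 4, and bits 25:6 of the instruction word); reading the instruction from
memory is left to the caller.

```python
def break_instruction_address(epc, cause_bd):
    """Address of the BREAK itself: EPC, or EPC + 4 in a delay slot."""
    return (epc + 4 if cause_bd else epc) & 0xFFFF_FFFF


def break_code(instruction):
    """Extract the 20 unused bits (25:6) of a BREAK instruction word."""
    return (instruction >> 6) & 0xFFFFF
```

A handler would read the word at break_instruction_address(...) and pass it to break_code
to distinguish between different uses of BREAK.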
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5.5.13 Reserved Instruction Exception 
Cause 


The Reserved Instruction exception occurs when one of the following conditions occurs:


• an attempt is made to execute an instruction with an undefined major opcode
  (bits 31:26)

• an attempt is made to execute a SPECIAL instruction with an undefined minor
  opcode (bits 5:0)

• an attempt is made to execute a REGIMM instruction with an undefined minor
  opcode (bits 20:16)

• an attempt is made to execute an MMI instruction with an undefined minor
  opcode (bits 10:0)

• an attempt is made to execute a COPz instruction with an undefined minor
  opcode (bits 25:21)


Note: In the C790, 64-bit operations are always valid in User, Supervisor, and Kernel
mode.


This exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The value of Cause.ExcCode is set to 10 (RI). The EPC register contains the address of the
reserved instruction unless it is in a branch delay slot, in which case the EPC register
contains the address of the preceding branch instruction.
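
The opcode fields named in the list above can be extracted from a 32-bit instruction word
as shown below. This is an illustrative helper only: opcode_fields and its dictionary keys
are hypothetical names, and deciding which field values are actually undefined requires the
opcode maps in the instruction set description, which are not reproduced here.

```python
def opcode_fields(instruction):
    """Split a 32-bit instruction word into the fields checked for RI."""
    return {
        "major": (instruction >> 26) & 0x3F,         # bits 31:26
        "special_minor": instruction & 0x3F,         # bits 5:0
        "regimm_minor": (instruction >> 16) & 0x1F,  # bits 20:16
        "mmi_minor": instruction & 0x7FF,            # bits 10:0
        "copz_minor": (instruction >> 21) & 0x1F,    # bits 25:21
    }
```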
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5.5.14 Coprocessor Unusable Exception 
Cause 


The Coprocessor Unusable exception occurs when an attempt is made to execute a
coprocessor instruction for either:


• a corresponding coprocessor unit that has not been marked usable via the
  Status.CU[ ] bits, or

• COP0 instructions, when the unit has been marked not usable and the process
  executes in either User or Supervisor mode.


NOTE: COP0 instructions always execute in Kernel mode, regardless of the
setting of Status.CU[0]. Also note that the operation of the COP0 instructions EI
and DI is not controlled by Status.CU[0]. Instead, the Status.EDI bit specifies
whether the EI and DI instructions execute in User and Supervisor modes. In
case execution is suppressed, EI and DI behave as no-operations in User and
Supervisor modes; they do not signal an exception.


The exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The value of Cause.ExcCode is set to 11 (CpU) and the field Cause.CE (Coprocessor Usage
Error) is set to indicate which of the four coprocessors was referenced. The EPC register
contains the address of the unusable coprocessor instruction unless it is in a branch delay
slot, in which case the EPC register contains the address of the preceding branch
instruction.


Servicing


The coprocessor unit to which an attempted reference was made is identified by the CE
(Coprocessor Usage Error) field, which results in one of the following situations:


e If the process is entitled access to the coprocessor, the coprocessor is marked 
usable and the corresponding user state is restored to the coprocessor. 

e If the process is entitled access to the coprocessor, but the coprocessor does not 
exist or has failed, interpretation of the coprocessor instruction is possible. 

e If the BD bit is set in the Cause register, the branch instruction must be 
interpreted; then the coprocessor instruction can be emulated and execution 
resumed with the EPC register advanced past the coprocessor instruction. 
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5.5.15 Interrupt Exception 
Cause 


The Interrupt exception occurs when one of the three interrupt signals is asserted. The 
significance of the interrupts is dependent upon the specific system implementation. 


Each of the three interrupts can be masked by clearing the corresponding bit in the
IntMask field of the Status register, and all three interrupts can be masked at once by
clearing the IE bit or EIE bit of the Status register.


All three interrupts are also masked at once when the EXL or ERL bit of the Status 
register is set to 1. 
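
The masking rules above can be summarized as a small predicate. The following is a minimal sketch only: the struct, its field names, and the packing of the three IntMask bits into bits 0..2 of one byte are hypothetical conveniences for illustration, not the hardware layout of the Status register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software view of the Status bits named in the text. */
struct status_bits {
    bool ie;          /* Status.IE  */
    bool eie;         /* Status.EIE */
    bool exl;         /* Status.EXL */
    bool erl;         /* Status.ERL */
    uint8_t int_mask; /* one enable bit per interrupt source, bits 0..2 (assumed packing) */
};

/* All interrupts are masked if IE or EIE is clear, or if EXL or ERL is set. */
static bool interrupts_globally_enabled(const struct status_bits *s)
{
    return s->ie && s->eie && !s->exl && !s->erl;
}

/* An interrupt is taken only if it is pending, individually unmasked,
 * and interrupts are globally enabled. */
static bool interrupt_taken(const struct status_bits *s, uint8_t pending)
{
    return interrupts_globally_enabled(s) && ((pending & s->int_mask & 0x7u) != 0);
}
```

For example, with IE = EIE = 1 and EXL = ERL = 0, a pending source is taken only when its mask bit is set; setting EXL masks every source at once.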


Interrupt IP[7] is set when the Count register is equal to the Compare register.

Exception Level: 1

Vector Address: 0x8000 0200 (BEV = 0), 0xBFC0 0400 (BEV = 1)

Processing 


The value of Cause.ExcCode is set to 0 (Int). The IP field of the Cause register indicates
current interrupt requests. It is possible that more than one of the bits can be 
simultaneously set (or even no bits may be set) if the interrupt is asserted and then 
deasserted before this register is read. 


Servicing 


If the interrupt is hardware-generated, the interrupt condition is cleared by correcting the 
condition causing the interrupt pin to be asserted. 


Due to the on-chip write buffer, a store to an external device (possibly clearing the 
interrupt) may not occur until after other instructions in the pipeline finish. Hence, the 
user must ensure that the store will occur before the return from exception instruction 
(ERET) is executed. This can be ensured by executing a SYNC instruction. Otherwise the
interrupt may be serviced again even though there is no actual interrupt pending. 
5.5.16 SIO Exception
Cause


The SIO exception occurs when the SIOInt signal is asserted. This exception is maskable
by setting the Status.ERL bit.


Exception Level: 2
Vector Address: 0x8000 0100 (DEV = 0), 0xBFC0 0300 (DEV = 1)
Processing


The value of Cause.EXC2 is set to 3 (Dbg). Cause.SIOP is set to 1. The ErrorEPC
register contains the address of the instruction where the SIO exception was detected
unless it is in a branch delay slot, in which case the ErrorEPC register contains the
address of the preceding branch instruction and Cause.BD2 is set.


Servicing 


When this exception is recognized, control is transferred to the applicable service routine. 
5.5.17 Integer Overflow Exception
Cause 


An Integer Overflow exception occurs when an ADD, ADDI, SUB, DADD, DADDI or 
DSUB instruction results in a 2’s complement overflow. This exception is not maskable. 


Exception Level: 1 
Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The value of Cause.ExcCode is set to 12 (Ov). The EPC register contains the address of the
instruction that caused the exception unless the instruction is in a branch delay slot, in 
which case the EPC register contains the address of the preceding branch instruction and 
the BD bit of the Cause register is set. 
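
To make the overflow condition concrete, the following sketch checks whether a 32-bit signed addition or subtraction overflows in 2's complement, which is the condition under which ADD, ADDI and SUB raise this exception. The function names are hypothetical; the code models the arithmetic only and does not describe the processor's internal logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a + b overflows as a 32-bit 2's complement addition:
 * both operands have the same sign and the sum has the opposite sign. */
static bool add32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub; /* unsigned arithmetic wraps, no undefined behavior */
    return ((ua ^ sum) & (ub ^ sum) & 0x80000000u) != 0;
}

/* True when a - b overflows as a 32-bit 2's complement subtraction:
 * the operands differ in sign and the result's sign differs from a's. */
static bool sub32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    return ((ua ^ ub) & (ua ^ diff) & 0x80000000u) != 0;
}
```

A kernel that emulates these instructions could use such a check to decide whether to signal the exception to the user program.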
5.5.18 Trap Exception
Cause


The Trap exception occurs when a TGE, TGEU, TLT, TLTU, TEQ, TNE, TGEI, TGEIU,
TLTI, TLTIU, TEQI, or TNEI instruction results in a TRUE condition. This exception is
not maskable.


Exception Level: 1
Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing 


The value of Cause.ExcCode is set to 13 (Tr). The EPC register contains the address of the 
instruction causing the exception unless the instruction is in a branch delay slot, in which 
case the EPC register contains the address of the preceding branch instruction and 
Cause.BD is set. 
5.5.19 Floating-Point Exception
Cause 


The Floating-Point exception is used by the floating-point coprocessor. This exception is 
not maskable. 


Exception Level: 1 
Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)
Processing


The common exception vector is used for this exception, and the FPE code in the Cause
register is set.


The contents of the Floating-Point Control/Status register indicate the cause of this 
exception. 


This exception is cleared by clearing the appropriate bit in the Floating-Point 
Control/Status register. 


For an unimplemented instruction exception, the kernel should emulate the instruction; 
for other exceptions, the kernel should pass the exception to the user program that caused 
the exception. 
6. Memory Management


The C790 processor provides a memory management unit (MMU) which uses an on-chip 
translation look-aside buffer (TLB) to translate virtual addresses into physical addresses. 


The C790 supports the MIPS-compatible 32-bit address and 64-bit data mode. Only 32-bit
virtual and physical addresses are implemented. Address sign extension is not required,
and no address error exception checking is done on the "upper" 32 bits, which are
ignored. The only conditions that generate the address error exception are address
alignment errors and segment protection errors. In Kernel mode, the program counter can
wrap around from kseg3 to kuseg without causing an address error exception.


Since there is only one addressing mode, all four MIPS ISAs (I, II, III, IV) and the
C790-specific ISA are available without restriction in all three processor modes
(subject to the applicable MIPS ISA coprocessor-usable restrictions). As a result, the
reserved instruction (RI) exception occurs only when the processor actually tries to
execute an undefined opcode.


This chapter describes the processor virtual and physical address spaces, the virtual-to- 
physical address translation, the operation of the TLB in making these translations, and 
those System Control Coprocessor (COP0) registers that provide the software interface to
the TLB. 
6.1 Translation Look-aside Buffer (TLB)


Mapped virtual addresses are translated into physical addresses using an on-chip TLB. 
The TLB is a fully associative memory that holds 48 entries, which provide mapping to 48 
odd / even page pairs (96 pages). When address mapping is indicated, each TLB entry is 
checked simultaneously for a match with the virtual address that is extended with an 
ASID stored in the low 8 bits of the EntryHi register. 


The size of a mapped page ranges from 4 KB to 16 MB, in multiples of four;
that is, 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, or 16 MB.
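
The seven page sizes follow a simple pattern: each step multiplies the size by four, which adds two bits to the page offset. The sketch below enumerates them; the step index `i` and the function names are hypothetical and do not correspond to any register encoding.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PAGE_SIZES 7u /* 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB */

/* Page size in bytes for step i (0..6): 4 KB multiplied by 4^i. */
static uint32_t page_size_bytes(unsigned i)
{
    assert(i < NUM_PAGE_SIZES);
    return UINT32_C(4096) << (2u * i);
}

/* Number of page-offset bits for step i: 12 for 4 KB up to 24 for 16 MB. */
static unsigned page_offset_bits(unsigned i)
{
    assert(i < NUM_PAGE_SIZES);
    return 12u + 2u * i;
}
```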


6.1.1. Translation Status 


In the C790 processor, as in the R4000, each TLB entry holds two sets of mapping
information, one for each page of an odd/even page pair. The translation result is
therefore categorized into three states: hit, miss, and invalid.


Upon address translation, if there is no virtual address match in all 48 entries, the 
translation result is categorized as TLB miss. 

In this case, an exception is taken and software refills the TLB from the page table 
resident in memory. Software can write over a selected TLB entry or use a hardware 
mechanism to write into a random entry. 


If there is a match on translation, the following takes place in the TLB hardware. 


1. The translation information for the odd page and the even page is read out of the
matching entry. The page size is extracted at the same time.


2. The TLB selects one of the two sets of translation information according to the
page size extracted above and the virtual address.
This becomes the translation result in the TLB.


The translation result includes a valid flag that indicates whether the translation
information is valid. If the flag is marked as 'valid', the translation is handled as a
TLB hit. The physical page number is extracted from the TLB and concatenated with the
offset to form the physical address (see Figure 6-1).


If the flag is marked as 'invalid', the translation result is recognized as TLB invalid. In
this case, an exception is taken to request the software to update the entry that got a
match upon translation, by probing the TLB using the TLBP operation.
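
The three translation states can be sketched as follows. This is an illustrative model, not the hardware implementation: the struct and function names are hypothetical, and the sketch assumes that the virtual address bit immediately above the page offset selects between the even and odd page, which follows from VPN2 being the virtual page number divided by two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum tlb_status { TLB_HIT, TLB_MISS, TLB_INVALID };

/* One half (even or odd page) of a TLB entry. */
struct tlb_half {
    uint32_t pfn;
    bool valid;
};

/* Hypothetical model of a matched entry: two halves plus the page size. */
struct tlb_entry_pair {
    struct tlb_half even;
    struct tlb_half odd;
    unsigned offset_bits; /* 12 for 4 KB pages up to 24 for 16 MB pages */
};

/* Step 2 of the text: pick the even or odd half from page size and address. */
static const struct tlb_half *select_half(const struct tlb_entry_pair *e, uint32_t va)
{
    return ((va >> e->offset_bits) & 1u) ? &e->odd : &e->even;
}

/* match is the entry whose tag matched, or NULL if no entry matched. */
static enum tlb_status classify(const struct tlb_entry_pair *match, uint32_t va)
{
    if (match == NULL)
        return TLB_MISS;
    return select_half(match, va)->valid ? TLB_HIT : TLB_INVALID;
}
```

Note that an entry can hit for one page of the pair and report invalid for the other, because each half carries its own valid flag.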


6.1.2 Multiple Matches 


A multiple match is the condition in which two or more entries match during address
translation. This is strictly prohibited, and software is expected never to allow it
to occur.

The C790 processor does NOT provide any means of detecting this in hardware, such as
TLB shutdown. The result of this condition is undefined, and further execution may
produce incorrect results.
6.2 Address Spaces


This section describes the virtual and physical address spaces and the manner in which 
virtual addresses are converted or “translated” into physical addresses in the TLB. 


6.2.1. Virtual Address Space 


The C790 only implements 32 bits of virtual address space. There is no requirement for 
address sign extension and no checking will be done on the upper 32 bits of the address. 


Figure 6-1 shows the translation of a virtual address into a physical address. 


[Figure 6-1 diagram: Virtual address (ASID, VPN, Offset) -> TLB -> Physical address (PFN, Offset)]

1. The virtual address (VA), represented by the virtual page number (VPN), is
concatenated with the ASID and compared with the tags in the TLB.

2. If there is a match, the page frame number (PFN), representing the upper bits of the
physical address (PA), is output from the TLB.

3. The Offset, which does not pass through the TLB, is then concatenated to the PFN.

Figure 6-1. Overview of a Virtual-to-Physical Address Translation 


As shown in Figure 6-2, the virtual address is extended with an 8-bit address space 
identifier (ASID), which reduces the frequency of TLB flushing when switching contexts. 
This 8-bit ASID is in the COP0 EntryHi register as described later in this chapter.
6.2.2 Physical Address Space


Using a 32-bit address, the processor physical address space encompasses 4 GB. The 
following section describes the translation of a virtual address to a physical address. 


6.2.3 Virtual-to-Physical Address Translation 


Converting a virtual address to a physical address begins by comparing the virtual 
address from the processor with the virtual addresses in the TLB; there is a match when 
the virtual page number (VPN) of the address is the same as the VPN field of the entry, 
and either: 


• the Global (G) bit of the TLB entry is set, or
• the ASID field of the virtual address (taken from the 8-bit ASID field of the
EntryHi register) is the same as the ASID field of the TLB entry.


If there is no match, a TLB Miss exception is taken by the processor and software can 
refill the TLB from a page table of virtual / physical addresses in memory. 


If there is a virtual address match in the TLB, the physical address is output from the 
TLB and concatenated with the Offset, which represents an address within the page 
frame space. The Offset does not pass through the TLB. At the same time, the valid bit 
output from TLB is checked to qualify the translation. If this bit is not set, a TLB Invalid 
exception is taken by the processor and software can update the TLB. 
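
The match rule and the formation of the physical address described in this section can be sketched as below. The names are hypothetical, and two simplifications are assumed: the VPN comparison ignores the page-size mask, and `frame_number` is expressed in units of the page size rather than as a raw PFN field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the tag portion of a TLB entry. */
struct tlb_tag {
    uint32_t vpn;  /* virtual page number held in the entry */
    uint8_t asid;  /* ASID held in the entry */
    bool global;   /* G bit */
};

/* An entry matches when the VPN is equal and either G is set
 * or the entry's ASID equals the current ASID from EntryHi. */
static bool tlb_tag_matches(const struct tlb_tag *t, uint32_t vpn, uint8_t current_asid)
{
    return t->vpn == vpn && (t->global || t->asid == current_asid);
}

/* The Offset bypasses the TLB and is concatenated with the translated frame. */
static uint32_t make_physical(uint32_t frame_number, uint32_t va, unsigned offset_bits)
{
    uint32_t offset_mask = (UINT32_C(1) << offset_bits) - 1u;
    return (frame_number << offset_bits) | (va & offset_mask);
}
```

With a global entry the ASID comparison is skipped entirely, which is why global pages remain visible across context switches.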


Virtual-to-physical translation is described in greater detail throughout the remainder of 
this chapter. Figure 6-9, shown at the end of this chapter, is a detailed flow diagram of 
this process. 
6.2.4 32-bit Address Translation Mode


The C790 supports only 32-bit address translation mode. 64-bit addressing mode is not 
supported. 


Figure 6-2 shows the virtual-to-physical address translation of a 32-bit address. 


• The top portion of Figure 6-2 shows a virtual address with a 12-bit, or 4-KB,
page size, labeled Offset. The remaining 20 bits of the address represent the
VPN, and index the 1M-entry page table.

• The bottom portion of Figure 6-2 shows a virtual address with a 24-bit, or 16-MB,
page size, labeled Offset. The remaining 8 bits of the address represent the
VPN, and index the 256-entry page table.
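
The two cases above differ only in where the address is split. The following sketch separates a 32-bit virtual address into VPN and Offset for a given offset width and computes the resulting page table size; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct va_fields {
    uint32_t vpn;
    uint32_t offset;
};

/* Split a 32-bit virtual address; offset_bits is 12 (4 KB) to 24 (16 MB). */
static struct va_fields split_va(uint32_t va, unsigned offset_bits)
{
    struct va_fields f;
    assert(offset_bits >= 12u && offset_bits <= 24u && offset_bits % 2u == 0u);
    f.vpn = va >> offset_bits;
    f.offset = va & ((UINT32_C(1) << offset_bits) - 1u);
    return f;
}

/* Number of pages, and therefore page table entries, in the 32-bit space. */
static uint32_t page_table_entries(unsigned offset_bits)
{
    assert(offset_bits >= 12u && offset_bits <= 24u);
    return UINT32_C(1) << (32u - offset_bits);
}
```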


[Figure 6-2 diagram; the recoverable content is summarized below.]

Virtual Address with 1M (2²⁰) 4-Kbyte pages:
bits 39..32 ASID (8 bits), bits 31..12 VPN (20 bits), bits 11..0 Offset (12 bits)

Virtual Address with 256 (2⁸) 16-Mbyte pages:
bits 39..32 ASID (8 bits), bits 31..24 VPN (8 bits), bits 23..0 Offset (24 bits)

Bits 31, 30 and 29 of the virtual address select user, supervisor, or kernel address
spaces. The VPN undergoes virtual-to-physical translation in the TLB, and the Offset is
passed unchanged to physical memory. The resulting 32-bit physical address (bits 31..0)
consists of the PFN and the Offset.


Figure 6-2. 32-bit Mode Virtual Address Translation 
6.2.5 Operating Modes


The processor has the three standard MIPS operating modes: 


• User mode
• Supervisor mode
• Kernel mode


The operating system (when in Kernel mode) selects among the three modes by writing
into the Status register's KSU field. The processor is forced into Kernel mode when it
is handling a Level 1 exception (the EXL bit is set; also called the Exception Level
mode in R-series processors) or a Level 2 exception (the ERL bit is set; also called
the Error Level mode in R-series processors).
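
The mode selection rule can be written as a short decode function. This is a sketch under stated assumptions: the enum and function names are hypothetical, and the KSU value 11 (binary), which the text does not describe, is treated here as reserved.

```c
#include <assert.h>
#include <stdbool.h>

enum cpu_mode { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_RESERVED };

/* Decode the operating mode from the Status fields KSU, EXL and ERL.
 * EXL or ERL forces Kernel mode regardless of KSU. */
static enum cpu_mode decode_mode(unsigned ksu, bool exl, bool erl)
{
    if (exl || erl)
        return MODE_KERNEL;
    switch (ksu & 3u) {
    case 0u: return MODE_KERNEL;     /* KSU = 00 */
    case 1u: return MODE_SUPERVISOR; /* KSU = 01 */
    case 2u: return MODE_USER;       /* KSU = 10 */
    default: return MODE_RESERVED;   /* KSU = 11: not described in the text (assumption) */
    }
}
```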


In the following table, dashes represent ‘don’t cares’. 


Table 6-1 Processor Modes 


Description                              KSU   ERL   EXL

32-bit User mode                         10    0     0

32-bit Supervisor mode                   01    0     0

32-bit Kernel mode                       00    0     0

32-bit Kernel mode (Level 1 exception)   -     0     1

32-bit Kernel mode (Level 2 exception)   -     1     -


Figure 6-3 shows a state transition among these three modes. 


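[Figure 6-3 diagram: an exception moves the processor from User mode or Supervisor mode
to Kernel mode; ERET with KSU = 10 returns to User mode, and ERET with KSU = 01 returns
to Supervisor mode.]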

Figure 6-3 State Transition among Operating Modes 
Table 6-2 summarizes address space for each operating mode.


Table 6-2. Address Space


Virtual Address    32-bit User Mode    32-bit Supervisor Mode    32-bit Kernel Mode

0xFFFF FFFF to     Address Error       Address Error             kseg3 (0.5 GB)
0xE000 0000                                                      Mapped

0xDFFF FFFF to     Address Error       sseg (0.5 GB)             ksseg (0.5 GB)
0xC000 0000                            Mapped                    Mapped

0xBFFF FFFF to     Address Error       Address Error             kseg1 (0.5 GB)
0xA000 0000                                                      Unmapped*, Uncached

0x9FFF FFFF to     Address Error       Address Error             kseg0 (0.5 GB)
0x8000 0000                                                      Unmapped*, Cached**

0x7FFF FFFF to     useg (2 GB)         suseg (2 GB)              kuseg (2 GB)
0x0000 0000        Mapped              Mapped                    Mapped (becomes
                                                                 unmapped if ERL is 1)


*Note: Virtual addresses of Kernel segments, kseg0 and kseg1, are not mapped through the
TLB and are always translated into physical addresses from 0x0000 0000 to 0x1FFF FFFF.


** Note: 
6.2.6 User Mode Operations


In User mode, a single, uniform virtual address space, labeled User segment, is available;
its size is:

• 2 GB (2³¹ bytes) (useg)

Figure 6-4 shows User mode virtual address space.

[Figure 6-4 diagram, 32-bit virtual address: 0x8000 0000 through 0xFFFF FFFF is
Address Error; 0x0000 0000 through 0x7FFF FFFF is useg.]

Figure 6-4. User Mode Virtual Address Space 


The User segment starts at address 0x0000 0000 and the current active user process 
resides in useg. The TLB identically maps all references to useg from all modes, and 
controls cache accessibility. 


The processor operates in User mode when the Status register contains the following bit- 
values: 

• KSU bits = 10₂

• and EXL = 0

• and ERL = 0
Table 6-3 lists the characteristics of the User mode segment, useg.


Table 6-3. User Mode Segments


Address Bit    Status Register Bit Values    Segment    Virtual Address Range    Segment Size
Values         KSU     EXL     ERL           Name

A[31] = 0      10₂     0       0             useg       0x0000 0000 through      2 Gbyte
                                                        0x7FFF FFFF              (2³¹ bytes)


User Mode, User Space (useg)


In User mode (KSU = 10₂ in the Status register), when the most-significant bit of the
32-bit virtual address is set to 0, the useg virtual address space is selected; it covers
the 2³¹ bytes (2 GB) of the current user address space. All valid User mode virtual
addresses have their most-significant bit cleared to 0; any attempt to reference an
address with the most-significant bit set while in User mode causes an Address Error
exception.


The system maps all references to useg through the TLB. Bit settings within the TLB
entry for the page determine the cacheability of a reference. The virtual address is
extended with the contents of the 8-bit ASID field to form a unique virtual address.


This mapped space starts at virtual address 0x0000 0000 and runs through 0x7FFF FFFF.
6.2.7 Supervisor Mode Operations


Supervisor mode is designed for layered operating systems in which a true kernel runs in 
C790 Kernel mode, and the rest of the operating system runs in Supervisor mode. 


The processor operates in Supervisor mode when the Status register contains the 
following bit-values: 

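• KSU = 01₂

• and EXL = 0

• and ERL = 0


[Figure 6-5 diagram, 32-bit virtual address: 0xE000 0000 through 0xFFFF FFFF is
Address error; 0xC000 0000 through 0xDFFF FFFF is sseg (0.5 GB, Mapped);
0x8000 0000 through 0xBFFF FFFF is Address error; 0x0000 0000 through 0x7FFF FFFF is
suseg (2 GB, Mapped).]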


Figure 6-5. Supervisor Mode Virtual Address Space 


Table 6-4. Supervisor Mode Segments


Address Bit        Status Register Bit Values    Segment    Virtual Address Range    Segment Size
Values             KSU     EXL     ERL           Name

A[31] = 0          01₂     0       0             suseg      0x0000 0000 through      2 Gbyte
                                                            0x7FFF FFFF              (2³¹ bytes)

A[31:29] = 110₂    01₂     0       0             sseg       0xC000 0000 through      0.5 Gbyte
                                                            0xDFFF FFFF              (2²⁹ bytes)


Supervisor Mode, User Space (suseg)


In Supervisor mode (KSU = 01₂ in the Status register), when the most-significant bit of
the 32-bit virtual address is set to 0, the suseg virtual address space is selected; it covers
the 2³¹ bytes (2 Gbytes) of the current user address space. The virtual address is extended
with the contents of the 8-bit ASID field to form a unique virtual address.


This mapped space starts at virtual address 0x0000 0000 and runs through 0x7FFF FFFF.


Supervisor Mode, Supervisor Space (sseg)


In Supervisor mode (KSU = 01₂ in the Status register), when the three most-significant
bits of the 32-bit virtual address are 110₂, the sseg virtual address space is selected; it
covers 2²⁹ bytes (512 Mbytes) of the current supervisor address space. The virtual address
is extended with the contents of the 8-bit ASID field to form a unique virtual address.


This mapped space begins at virtual address 0xC000 0000 and runs through 0xDFFF FFFF.
6.2.8 Kernel Mode Operations


The processor operates in Kernel mode when the Status register contains one of the
following values:


• KSU = 00₂
• or EXL = 1
• or ERL = 1


The processor enters Kernel mode whenever an exception is detected and it remains in 
Kernel mode until an Exception Return (ERET) instruction is executed. The ERET 
instruction restores the processor to the mode existing prior to the exception. 


Kernel mode virtual address space is divided into regions differentiated by the high-order 
bits of the virtual address, as shown in Figure 6-6. 


Table 6-5 lists the characteristics of the kernel mode segments. 


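[Figure 6-6 diagram; 32-bit virtual address space on the left, 32-bit physical address
space (0x0000 0000 through 0xFFFF FFFF) on the right. The recoverable content is:]

0xE000 0000 through 0xFFFF FFFF   kseg3                               Translated by TLB
0xC000 0000 through 0xDFFF FFFF   ksseg                               Translated by TLB
0xA000 0000 through 0xBFFF FFFF   kseg1 (0.5 GB, Unmapped, Uncached)  Physical 0x0000 0000 through 0x1FFF FFFF
0x8000 0000 through 0x9FFF FFFF   kseg0                               Physical 0x0000 0000 through 0x1FFF FFFF
0x0000 0000 through 0x7FFF FFFF   kuseg (2 GB, Mapped; becomes        Translated by TLB
                                  unmapped if ERL = 1)

The 0.5 GB physical region 0x0000 0000 through 0x1FFF FFFF is labeled Kernel Boot and I/O.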


Figure 6-6. Kernel Mode Address Space 
Table 6-5. Kernel Mode Segments

Status register bit values for all segments: KSU = 00₂ or EXL = 1 or ERL = 1.

Address Bit        Segment    Virtual Address Range    Segment Size
Values             Name

A[31] = 0          kuseg      0x0000 0000 through      2 Gbyte
                              0x7FFF FFFF              (2³¹ bytes)

A[31:29] = 100₂    kseg0      0x8000 0000 through      0.5 Gbyte
                              0x9FFF FFFF              (2²⁹ bytes)

A[31:29] = 101₂    kseg1      0xA000 0000 through      0.5 Gbyte
                              0xBFFF FFFF              (2²⁹ bytes)

A[31:29] = 110₂    ksseg      0xC000 0000 through      0.5 Gbyte
                              0xDFFF FFFF              (2²⁹ bytes)

A[31:29] = 111₂    kseg3      0xE000 0000 through      0.5 Gbyte
                              0xFFFF FFFF              (2²⁹ bytes)


Kernel Mode, User Space (kuseg)


In Kernel mode (KSU = 00₂ or EXL = 1 or ERL = 1 in the Status register), when the
most-significant bit of the virtual address, A[31], is 0, the 32-bit kuseg virtual address
space is selected; it covers the full 2³¹ bytes (2 GB) of the current user address space.
The virtual address is extended with the contents of the 8-bit ASID field to form a unique
virtual address.


When ERL = 1 in the Status register, the user address region, kuseg, becomes a 2³¹-byte
unmapped, uncached address space (that is, mapped directly to physical addresses
0x0000 0000 through 0x7FFF FFFF).


Kernel Mode, Kernel Space 0 (kseg0)


In Kernel mode (KSU = 00₂ or EXL = 1 or ERL = 1 in the Status register), when the
most-significant three bits of the virtual address are 100₂, the 32-bit kseg0 virtual
address space is selected; it is the 2²⁹-byte (512 MB) kernel physical space.


References to kseg0 are not mapped through the TLB; the physical address selected is
defined by subtracting 0x8000 0000 from the virtual address. The K0 field of the Config
register, described in this chapter, controls cacheability and coherency.


Kernel Mode, Kernel Space 1 (kseg1)


In Kernel mode (KSU = 00₂ or EXL = 1 or ERL = 1 in the Status register), when the
most-significant three bits of the 32-bit virtual address are 101₂, the 32-bit kseg1
virtual address space is selected; it is the 2²⁹-byte (512 MB) kernel physical space.


References to kseg1 are not mapped through the TLB; the physical address selected is
defined by subtracting 0xA000 0000 from the virtual address.


Caches are disabled for accesses to these addresses, and physical memory (or
memory-mapped I/O device registers) is accessed directly.
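
The fixed translation for the two unmapped kernel segments is a plain subtraction, as the following sketch shows. The function name and its out-parameter convention are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Translate a kseg0 or kseg1 virtual address to its physical address.
 * Returns false, leaving *pa untouched, for addresses outside both segments. */
static bool kseg_unmapped_to_phys(uint32_t va, uint32_t *pa)
{
    if (va >= 0x80000000u && va <= 0x9FFFFFFFu) { /* kseg0: cacheability set by Config.K0 */
        *pa = va - 0x80000000u;
        return true;
    }
    if (va >= 0xA0000000u && va <= 0xBFFFFFFFu) { /* kseg1: uncached */
        *pa = va - 0xA0000000u;
        return true;
    }
    return false;
}
```

Both segments therefore alias the same 512 MB of physical memory, 0x0000 0000 through 0x1FFF FFFF; for example, the BEV = 1 vector base 0xBFC0 0000 in kseg1 corresponds to physical address 0x1FC0 0000.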


Kernel Mode, Supervisor Space (ksseg)


In Kernel mode (KSU = 00₂ in the Status register), when the most-significant three bits of
the 32-bit virtual address are 110₂, the ksseg virtual address space is selected; it is the
current 2²⁹-byte (512 MB) supervisor virtual space. The virtual address is extended with
the contents of the 8-bit ASID field to form a unique virtual address.


Kernel Mode, Kernel Space 3 (kseg3)


In Kernel mode (KSU = 00₂ in the Status register), when the most-significant three bits of
the 32-bit virtual address are 111₂, the kseg3 virtual address space is selected; it is the
current 2²⁹-byte (512 MB) kernel virtual space. The virtual address is extended with the
contents of the 8-bit ASID field to form a unique virtual address.
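
The segment boundaries and the per-mode access rules of Table 6-2 can be sketched as a pair of functions. The enum and function names are hypothetical; a `false` result from `access_allowed` corresponds to the Address Error cells of the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum access_mode { ACCESS_USER, ACCESS_SUPERVISOR, ACCESS_KERNEL };
enum segment { SEG_USER, SEG_KSEG0, SEG_KSEG1, SEG_SUPERVISOR, SEG_KSEG3 };

/* Classify a 32-bit virtual address by A[31] and A[31:29]. */
static enum segment segment_of(uint32_t va)
{
    if ((va >> 31) == 0u)
        return SEG_USER;               /* useg / suseg / kuseg */
    switch (va >> 29) {
    case 4u: return SEG_KSEG0;         /* 100 */
    case 5u: return SEG_KSEG1;         /* 101 */
    case 6u: return SEG_SUPERVISOR;    /* 110: sseg / ksseg */
    default: return SEG_KSEG3;         /* 111 */
    }
}

/* True if the mode may reference the address without an Address Error. */
static bool access_allowed(enum access_mode mode, uint32_t va)
{
    switch (segment_of(va)) {
    case SEG_USER:       return true;
    case SEG_SUPERVISOR: return mode != ACCESS_USER;
    default:             return mode == ACCESS_KERNEL;
    }
}
```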
6.3 System Control Coprocessor


The System Control Coprocessor (COP0) is implemented as an integral part of the CPU,
and supports memory management, address translation, exception handling, and other
privileged operations. The COP0 registers shown in Figure 6-7 plus a 48-entry TLB make
up the MMU.


Each COP0 register has a unique number that identifies it; this number is referred to as
the register number. For instance, the PageMask register is register number 5.


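[Figure 6-7 diagram: the COP0 registers that interface to the 48-entry TLB, each entry
128 bits wide (bits 127..0). Register numbers, marked * in the figure: Index 0, Random 1,
EntryLo0 2, EntryLo1 3, Context 4, PageMask 5, Wired 6, BadVAddr 8, EntryHi 10,
Status 12. The wired TLB entries are labeled "Safe" entries (see the Random register and
the contents of the Wired register).]

*Register number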


Figure 6-7. COP0 Registers and the TLB
6.3.1. Format of a TLB Entry


Figure 6-8 shows the TLB entry format for the 32-bit address translation mode. Each
field of an entry has a corresponding field in the EntryHi, EntryLo0, EntryLo1, or
PageMask registers. For example, the Mask field of the TLB entry is also held in the
PageMask register.


128-bit TLB entry in 32-bit mode of the C790 processor:

Bits        Width    Field
127..121    7        0
120..109    12       MASK
108..96     13       0
95..77      19       VPN2
76          1        G
75..72      4        0
71..64      8        ASID
63..58      6        0
57..38      20       PFN
37..35      3        C
34          1        D
33          1        V
32          1        0
31..26      6        0
25..6       20       PFN
5..3        3        C
2           1        D
1           1        V
0           1        0


Figure 6-8. Format of a TLB Entry 


The formats of the EntryHi, EntryLo0, EntryLo1, and PageMask registers are nearly the
same as the TLB entry. The one exception is the Global field (G bit), which is used in the
TLB, but is reserved in the EntryHi register. The following register tables describe the
TLB entry fields shown in Figure 6-8.
PageMask Register

Bits      Width    Field
31..25    7        0
24..13    12       MASK
12..0     13       0


MASK Page comparison mask.
0 Reserved. Must be written as zeroes, and returns zeroes when read.


EntryHi Register

Bits      Width    Field
31..13    19       VPN2
12..8     5        0
7..0      8        ASID


VPN2 Virtual page number divided by two (maps to two pages). 
ASID Address space ID field. An 8-bit field that lets multiple processes share the TLB; each 
process has a distinct mapping of otherwise identical virtual page numbers. 
0 Reserved. Must be written as zeroes, and returns zeroes when read. 


EntryLo0 Register and EntryLo1 Register (same layout)

Bits      Width    Field
31..26    6        0
25..6     20       PFN
5..3      3        C
2         1        D
1         1        V
0         1        G


PFN Page frame number; the upper bits of the physical address.
C Specifies the TLB page coherency attribute; see Table 6-6.
D Dirty. If this bit is set, the page is marked as dirty and, therefore, writable. This bit is
actually a write-protect bit that software can use to prevent alteration of data.
V Valid. If this bit is set, it indicates that the TLB entry is valid; otherwise, a TLB invalid
exception occurs.
G Global. If this bit is set in both Lo0 and Lo1, then the processor ignores the ASID
during TLB lookup.
0 Reserved. Must be written as zeroes, and returns zeroes when read.
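
Using the bit positions listed for the EntryHi and EntryLo registers, register images can be assembled in software as sketched below. The helper names are hypothetical, and the sketch only composes 32-bit values; writing them to COP0 would require privileged instructions that are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EntryHi image: VPN2 in bits 31..13, reserved zeros in 12..8, ASID in 7..0. */
static uint32_t make_entryhi(uint32_t va, uint8_t asid)
{
    return (va & 0xFFFFE000u) | (uint32_t)asid;
}

/* EntryLo image: PFN in bits 25..6, C in 5..3, D in bit 2, V in bit 1, G in bit 0. */
static uint32_t make_entrylo(uint32_t pfn, unsigned c, bool d, bool v, bool g)
{
    return ((pfn & 0xFFFFFu) << 6)
         | ((uint32_t)(c & 7u) << 3)
         | ((uint32_t)d << 2)
         | ((uint32_t)v << 1)
         | (uint32_t)g;
}
```

Because the G bit takes effect only when set in both EntryLo0 and EntryLo1, software that wants a global mapping would pass `g = true` for both images.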


The TLB page coherency attribute (C) bits specify whether references to the page are
cached, uncached, or uncached accelerated. Table 6-6 shows the coherency
attributes selected by the C bits.


Table 6-6 TLB Page Coherency (C) Bit Values


| Od Reserved 


| 6 Reseed 


Write-back with allocate fetches the line with the missed data both on load misses and on 
store misses. Therefore, storing data to such pages is always performed to the data cache 
and will not be sent to the write buffer. 


Uncached accelerated data provides a special kind of acceleration for handling uncached 
data. On a load of an uncached accelerated data item (which can range in size from a byte 
to a quadword) the C790 will always fetch an aligned 128-byte quantity from memory. 
These eight quadwords will be placed in a special 128-byte buffer called the uncache 
accelerated buffer, or UCAB in the CPU. Any subsequent loads which “hit” the UCAB will 
get the data from the UCAB. This process reduces bus traffic. The UCAB will be 
invalidated under the following conditions: 


• any load operation which doesn't hit the buffer, or

• any store operation, or

• a SYNC (or SYNC.L) operation, or

• any exception.
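
The UCAB behavior described above can be modeled as a one-entry buffer keyed by a 128-byte aligned block address. This is a behavioral sketch under assumptions, not the hardware design: the struct and function names are hypothetical, and the model assumes that an uncached accelerated load that misses refills the buffer with the new block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UCAB_BYTES 128u /* eight quadwords */

struct ucab {
    bool valid;
    uint32_t base; /* 128-byte aligned address of the buffered block */
};

/* Address of the aligned 128-byte block that contains pa. */
static uint32_t ucab_block_base(uint32_t pa)
{
    return pa & ~(UCAB_BYTES - 1u);
}

/* Model an uncached accelerated load. Returns true on a UCAB hit;
 * on a miss the old contents are discarded and the new block is fetched. */
static bool ucab_load(struct ucab *b, uint32_t pa)
{
    uint32_t base = ucab_block_base(pa);
    if (b->valid && b->base == base)
        return true;
    b->valid = true;
    b->base = base;
    return false;
}

/* Model the other invalidating events: any store, SYNC or SYNC.L, any exception. */
static void ucab_invalidate(struct ucab *b)
{
    b->valid = false;
}
```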


For uncached accelerated stores, the C790 write-back buffer (128-bit x 8) also has some
special features. On the first store of an uncached accelerated write the write-back buffer
will mark the fact that this is an uncached accelerated write to a particular address.
Subsequent uncached accelerated stores which hit within the same 128-bit address
boundary will be accumulated (gathered) within the same write buffer entry. This process
of data gathering reduces bus traffic. The gathering process will be terminated under the
following conditions:


• any store which can't be gathered (different attribute or different address), or

• any load operation, or

• a SYNC (or SYNC.L) operation, or

• any exception.
6.4 Virtual-to-Physical Address Translation Process


In the supported 32-bit mode, the highest 8 to 20 bits of the virtual address (depending 
upon the page size) are compared to the contents of the TLB virtual page number. The 8- 
bit ASID is only compared if the global bit, G, is not set. 


If a TLB entry matches, the physical address and access control bits (C, D, and V) are 
retrieved from the matching TLB entry. While the V bit of the entry must be set for a 
valid translation to take place, it is not involved in the determination of a matching TLB 
entry. 


Figure 6-9 illustrates the TLB address translation process. 
[Figure 6-9 flowchart: from Virtual Address (Input) to Physical Address (Output), with
exits to the TLB Invalid and TLB Refill exceptions. For valid address space, see the
section describing Operating Modes in this chapter.]


Figure 6-9. TLB Address Translation 
If there is no TLB entry that matches the virtual address, a TLB miss exception occurs. If
the access control bits (D and V) indicate that the access is not valid, a TLB modified or
TLB invalid exception occurs.


If the C bits equal 010₂ (Uncached) or 111₂ (Uncached Accelerated), the physical address
that is generated directly accesses main memory, bypassing the cache.


6.5 TLB Instructions 


Table 6-7 lists the instructions that the CPU provides for working with the TLB. See 
Appendix C for a detailed description on these instructions. 


Table 6-7. TLB Instructions 


OpCode Description of Instruction 
TLBP Translation Look-aside Buffer Probe 


TLBR Translation Look-aside Buffer Read 
TLBWI Translation Look-aside Buffer Write Index 
TLBWR Translation Look-aside Buffer Write Random 
7. Caches


The C790 core contains both an instruction cache and a separate data cache. The
processor also contains a small read-only cache memory for the uncached accelerated
area.


This chapter describes the cache structures, operation of the caches, and cache control. 
7.1 Cache Features


The two caches are configured as shown in Table 7-1:


Table 7-1. Cache Configuration


Cache                Size     Organization    Line Size    Refill Size

Instruction Cache    32 KB    2-Way           64 bytes     64 bytes

Data Cache           32 KB    2-Way           64 bytes     64 bytes


The following are the main features of the caches: 


• Separate Instruction Cache and Data Cache

• Virtually indexed and physically tagged caches

• 64-byte line size

• 64-byte refill size

• 2-way set-associative cache for higher performance

• Write-back policy for the Data Cache

• Missed-quadword-first, sequential-order burst refills for the Data Cache

• Data Cache line locking

• Non-blocking loads

• Data Cache supports multiple hits under a single miss

• No snoop capability


No cache snoop capability has been provided. The user may choose to use CACHE 
instructions to keep coherency between caches and main memory. 


7.2 Organization of the Caches


Organization of the caches is illustrated in Figure 7-1 and Figure 7-2. Both the
Instruction Cache and the Data Cache are 2-way set-associative. Each cache line consists
of a tag and data. Each cache has a data line size of 64 bytes.


7.2.1. Data Cache 


The Data Cache is connected to the CPU via a 128-bit bus. Therefore, the Data Cache can 
supply to the CPU or the coprocessors up to a quadword of data per access. 


The following diagram shows Data Cache structure. Tags are discussed in detail in a later 
section. 


[Figure 7-1 diagram: the virtual index selects one of 256 entries in each of the two ways;
each way holds a physical tag (Phys.Tag0, Phys.Tag1) and a 64-byte data line (Data0,
Data1).]

Tag state bits:

L   Lock Bit    For description, see Section 7.3.7, Data Cache Lock Function
R   LRF Bit     For description, see Section 7.3.1, Line Replacement Algorithm
V   Valid Bit   For description, see Section 7.2.3, Tag Structure
D   Dirty Bit   For description, see Section 7.2.3, Tag Structure


Figure 7-1. Organization of Data Cache 
7.2.2 Instruction Cache


The Instruction Cache is connected to the CPU pipeline via a 64-bit bus. This enables the 
CPU to fetch two instructions per cycle from the Instruction Cache. 


The following diagram shows Instruction Cache structure. Tags are discussed in detail in 
a later section. 


[Figure 7-2 layout: two ways, Way 0 (Phys.Tag0, Data0) and Way 1 (Phys.Tag1, Data1). Each way has 256 entries of 64 bytes of data, selected by the Virtual Index.]


R    LRF Bit
V    Valid Bit


Figure 7-2. Organization of Instruction Cache


7.2.3 Tag Structure


The general structure of a tag consists of a set of state bits and a physical page frame 
number or PFN field. The Data Cache and the Instruction Cache have different numbers 
of state bits; for more information, refer to the discussions in the following sections. 


The size of the tag and the number of virtual address bits indexing the caches are 
dependent upon the size of the cache, address space, and set associativity. The C790 
supports 32-bit virtual and physical addresses as shown in the figure below: 


Virtual Address (VA)

  31          14 13 12 11            0
  |        VPN        |    OFFSET    |


Physical Address (PA)

  31          14 13 12 11            0
  |        PFN        |    OFFSET    |


Since the cache line size is fixed at 64 bytes, that is, four quadwords per entry, the Tag 
Cache associated with each way will have one tag for every four quadwords. Table 7-2 
shows cache sizes, address bits and tag size. 


Table 7-2. Cache Size and Access Bits


| Cache             | Cache Size | Way   | Size of Cache Each Way | Virtual Address Index Bits | Tag Cache Size of Each Way | Tag Virtual Address Index |
| Instruction Cache | 32 KB      | 2 WAY | 256 x 64 Bytes         |                            | 256 x 20 Bits              |                           |
| Data Cache        | 32 KB      | 2 WAY | 256 x 64 Bytes         |                            | 256 x 20 Bits              |                           |


While the caches are indexed by the virtual address, the tag comparison is physical. This 
is possible because the caches and the TLB are accessed in parallel. So, when the tags 
have been accessed, the page frame number is ready to be compared against the 
translated virtual address for a cache hit or miss. 
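
As a rough illustration of a virtually indexed, physically tagged lookup, the sketch below splits an address using the sizes given in this section: a 64-byte line gives a 6-bit byte offset, 256 entries per way give an 8-bit index, and the 20-bit tag is the PFN. The index position VA[13:6] is inferred from those sizes rather than stated here, and all function names are hypothetical.

```python
LINE_BYTES = 64          # line size (Table 7-1)
ENTRIES_PER_WAY = 256    # entries per way (Table 7-2)
PAGE_SHIFT = 12          # the PFN is address bits 31:12

OFFSET_BITS = LINE_BYTES.bit_length() - 1        # 6 offset bits
INDEX_BITS = ENTRIES_PER_WAY.bit_length() - 1    # 8 index bits


def cache_index(va):
    """Line index taken from the virtual address (assumed to be VA[13:6])."""
    return (va >> OFFSET_BITS) & (ENTRIES_PER_WAY - 1)


def line_offset(va):
    """Byte offset inside the 64-byte line."""
    return va & (LINE_BYTES - 1)


def physical_tag(pa):
    """20-bit tag compared against the stored PFN."""
    return (pa >> PAGE_SHIFT) & 0xFFFFF


def is_hit(va, pa, tags, valid):
    """tags and valid are per-way lists with 256 entries each (two ways)."""
    idx = cache_index(va)
    tag = physical_tag(pa)
    return any(valid[way][idx] and tags[way][idx] == tag for way in range(2))
```

In hardware the index lookup and the TLB translation proceed in parallel; the sequential code above only shows which address bits feed each step.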


C790 Programming Note:


Overlapping of the cache index bit range and the PFN bit range causes the “cache aliasing
problem”. The C790 does not have any hardware mechanism to detect cache aliasing. It is
the programmer's responsibility to avoid cache aliasing. When a physical page is mapped
to more than one virtual page, VPN[13:12] must be the same in all of the virtual addresses. A
conservative way to ensure this is to keep VPN[13:12] = PFN[13:12] whenever a page is
mapped.
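
The rule in the note can be checked mechanically when page mappings are set up. The following is a minimal sketch, assuming 4 KB pages (offset bits 11:0 as in the address layout above); the helper names are hypothetical and not part of any C790 software interface.

```python
ALIAS_MASK = 0x3000  # address bits 13:12, the index bits that lie above the page offset


def alias_bits(addr):
    """Return bits 13:12 of an address as a value 0..3."""
    return (addr & ALIAS_MASK) >> 12


def mappings_alias_free(virtual_addrs):
    """True if every virtual mapping of one physical page shares VPN[13:12]."""
    return len({alias_bits(va) for va in virtual_addrs}) <= 1


def conservative_mapping_ok(va, pa):
    """True if the mapping follows the conservative rule VPN[13:12] == PFN[13:12]."""
    return alias_bits(va) == alias_bits(pa)
```

An operating system's page allocator could apply a check like `conservative_mapping_ok` before installing a TLB entry.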
7.2.3.1 Data Cache Tag Structure


In addition to the physical page frame number (PFN), each Data Cache Tag entry also 
contains additional Cache State bits as shown below. All lines in both ways of the Data 
Cache have these four state bits. Cache line state bits are also illustrated in Figure 7-1. 


Data Cache Tag Fields


PFN


Two state bits, DIRTY and VALID, together identify which of three states a Data Cache
line is in: Valid Clean, Valid Dirty, or Invalid. Table 7-3 shows the state of the Data Cache
line as a function of the DIRTY and VALID bits.


Table 7-3. Data Cache Line States


| Dirty Bit (D) | Valid Bit (V) | Cache Line State |
| x             | 0             | Invalid          |
| 0             | 1             | Valid Clean      |
| 1             | 1             | Valid Dirty      |

Even if a CACHE instruction tries to set the V = 0, D = 1 state, the Dirty bit is forced to
zero in the C790 implementation.


The LRF bit is the Least-Recently-Filled line replacement bit.


The LRF bits serve as a replacement algorithm between the two ways of the Data Cache.
A refill access to a cache line in a way will flip the LRF bit to point to the other way as the
least recently filled. For details of the LRF line update operation refer to Section 7.3.1.


As Figure 7-1 illustrates, Data Cache lines in each way have a LOCK bit. The LOCK bit,
as explained in Section 7.3.7, Data Cache Lock Function, locks lines in one of the ways to
keep data from being replaced.


7.2.3.2 Instruction Cache Tag Structure 


In addition to the physical page frame number (PFN), each Instruction Cache Tag entry 
also contains two additional Cache State bits as shown below. All lines in both ways of the 
Instruction Cache have these two state bits. 


Instruction Cache Tag Fields


PFN


The Instruction Cache VALID state bit defines whether each line is in the Valid or Invalid
state.


The LRF bit is the Least-Recently-Filled line replacement bit. LRF bits serve as a
replacement algorithm between the two ways of the Instruction Cache. A refill access to a
cache line in a way will flip the LRF bit to point to the other way as the least recently
filled. For details of the LRF line update operation refer to Section 7.3.1.


7.2.4 State of Cache Tags After Reset


For all Data Cache tags the following fields are initialized to 0 upon reset:

• Valid
• Dirty
• LRF
• Lock


For all Instruction Cache tags the following fields are initialized to 0 upon reset:

• Valid
• LRF


All other fields in the Instruction Cache and the Data Cache contents are undefined upon
reset.


7.3 Cache Operations


This section describes cache operation in regard to read/write policies, coherency, write 
back policy, and the lock function. 


7.3.1 Line Replacement Algorithm 


The line replacement policy for both the Instruction Cache and the Data Cache is based on 
the Least Recently Filled (LRF) algorithm. In this policy, the LRF bit of a way is modified 
(inverted) only when a cache line refill occurs to the corresponding way. Load/store 
accesses to the Data Cache do not modify the LRF bit. The bit indicating which way is the 
least recently filled way is the XOR of the two LRF bits of the two ways of the cache. 


Table 7-4. LRF Line Replacement Algorithm


| Current Way0 LRF | Current Way1 LRF | XOR | Refill Way | New Way0 LRF | New Way1 LRF |


The column under XOR indicates the way which could be refilled (line replaced) on the 
next refill at that line location. Note that the table shown above is valid only when none 
of the ways of the cache line is locked. If a way of the cache line is locked, then regardless 
of the state of the LRF bits, the least recently filled way will always be the unlocked way. 


The behavior is also slightly different for the Instruction and Data Caches when one of the
ways is invalid. For the Data Cache the algorithm is followed exactly as given above,
irrespective of the ways being valid or invalid. For the Instruction Cache the algorithm
given above is followed as long as both ways are valid. Once a way becomes invalid,
that way gets priority for being filled over the valid way, irrespective of the LRF bits.
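
The replacement rules of this section can be sketched as a small model. Note the assumptions: the text says the XOR of the two LRF bits indicates the least recently filled way, and the sketch assumes that an XOR of 0 selects way 0 and an XOR of 1 selects way 1; the function and parameter names are hypothetical.

```python
def victim_way(lrf, locked=(False, False), valid=(True, True), icache=False):
    """Pick the way to refill at one cache index.

    lrf    -- (way0_lrf, way1_lrf) bits, each 0 or 1
    locked -- per-way Lock bits (Data Cache only)
    valid  -- per-way Valid bits
    icache -- True to apply the Instruction Cache rule for invalid ways
    Assumption: the XOR of the LRF bits is the number of the way to refill.
    """
    if locked[0] != locked[1]:
        # Exactly one way is locked: the unlocked way is always the victim.
        return 1 if locked[0] else 0
    if icache and valid[0] != valid[1]:
        # Instruction Cache: an invalid way is filled before a valid one.
        return 1 if valid[0] else 0
    return lrf[0] ^ lrf[1]


def refill(lrf, way):
    """Invert the LRF bit of the refilled way and return the new LRF pair."""
    new_bits = list(lrf)
    new_bits[way] ^= 1
    return tuple(new_bits)
```

Starting from the reset state (0, 0) and always refilling the indicated way, the model alternates between way 0 and way 1, which matches the description that a refill flips the indication to the other way.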


7.3.2 Non-blocking Loads and Hit Under Miss 


The Data Cache supports non-blocking load and hit under miss to improve performance. 
When a Data Cache miss occurs or an uncached load instruction is issued, Non-blocking 
load allows the pipeline to continue instruction execution until one of the following occurs: 


1. A subsequent non-load/store/pref instruction has data dependency with the load 
that is pending (to be retired). 


2. A pipeline stalls.


Hit under miss is a feature that allows access (load or store) to the Data Cache while a
previous load miss (cached, uncached or uncached accelerated), a previous store miss
(cached) or a previous prefetch miss (cached) is still pending. In this case, access to the
cache proceeds and the pipe does not stall.


Uncached loads also do not stall the pipeline while they are pending (to be retired). The 
pipeline continues instruction execution until one of the following occurs: 


1. A subsequent load/store/pref instruction has data dependency with the load that
is pending (to be retired).

2. A Data Cache miss occurs or a miss occurs on the Uncached Accelerated Buffer. 

3. An Uncached load instruction is issued. 


To summarize, Non-blocking load and Hit under miss allow the pipeline to continue
instruction execution until one of the following occurs when a Data Cache miss occurs or an
uncached load instruction is issued:


1. A subsequent instruction has data dependency with the load that is pending (to
be retired).


2. A Data Cache miss occurs or a miss occurs on the Uncached Accelerated Buffer.
3. An uncached load instruction is issued.
4. A pipeline stalls.


Loads to the GPRs (IU) and FPRs (FPU) all follow the non-blocking protocol (when it is
enabled). Loads to COP1 are always blocking.


7.3.3 Cache Miss and Hit Operations 


In case of a Data Cache hit, the cache provides data to the CPU in 128-bit (single 
quadword) quantities. In case of an Instruction Cache hit, the cache provides data 
(“instruction”) in 64-bit quantities. CPU reads or writes to the Data Cache in quantities 
less than 128 bits are specified by the least significant four bits of the address, bits 3:0. 


Cache misses are processed by the cache controller in 64-byte quantities, that is, one cache line.
Since the caches are connected to the system bus via a 128-bit bus, a cache refill takes a
burst of 4 bus cycles (8 CPU cycles); that is, four quadwords are transferred in 4 bus cycles
(the actual transfer time can be longer due to bus arbitration, etc.). These reads are performed in
sequential order for both the Instruction Cache and the Data Cache. The quadword for
which the address missed is always fetched first.


Table 7-5 indicates the sequential order. PA[5:4] are the two least-significant address bits that
are put out on the CPU Bus. Figure 7-3 illustrates the case where the second quadword,
shaded area, missed and shows the order in which data are read from main memory.


Table 7-5. Quadword Retrieved Address PA[5:4]


[Figure 7-3 layout: one cache line drawn as four 128-bit quadwords, with the read order marked across them as Third, Second, First, Fourth.]


Figure 7-3. Read Missed Processed in Sequential Order
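
The refill order can be illustrated with a short sketch. It assumes that "sequential order" wraps around within the 64-byte line after the missed quadword, which is how the read order in Figure 7-3 reads if the figure is drawn with the highest quadword on the left; that interpretation, the default bus-to-CPU clock ratio of 2, and the function names are assumptions.

```python
def refill_order(miss_pa):
    """Return the PA[5:4] values of the four quadwords in fetch order.

    The missed quadword is fetched first; the remaining three follow in
    ascending order, wrapping around inside the line (assumed behavior).
    """
    first = (miss_pa >> 4) & 0x3
    return [(first + i) % 4 for i in range(4)]


def refill_cpu_cycles(bus_cycles=4, cpu_cycles_per_bus_cycle=2):
    """Minimum CPU cycles for one burst, ignoring arbitration delays."""
    return bus_cycles * cpu_cycles_per_bus_cycle
```

For a miss on the second quadword (PA[5:4] = 01), `refill_order` yields 01, 10, 11, 00, so the lowest quadword arrives last.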


In case of a write miss to the Data Cache (for an allocate-on-write address), the cache
controller will read a cache line from main memory in sequential order. Whether the cache
line being replaced is first written out to memory (because its DIRTY bit is set)
is discussed in the next section.


The Instruction Cache processes cache misses in bursts of 4 quadwords, just like the Data
Cache. Furthermore, in case of an Instruction Cache miss, the pipeline starts in the same
cycle the final quadword is stored into the Instruction Cache.


7.3.4 Data Cache Writeback Policy 


Data cache lines are written back to the memory in the following cases: 


1. The processor executes the Index Write Back Invalidate CACHE instruction
suboperation as defined in Appendix C and the line data are dirty, or the Hit
Writeback Invalidate or Hit Writeback without Invalidate CACHE
suboperations hit on the Data Cache and the line data are dirty.


2. A read or write miss occurs and the line data are dirty. In this case the line has
to be written to memory before it can be replaced by the miss data.


7.3.5 Data Cache State Transitions


As discussed previously, lines in the Data Cache can be in one of three states: Invalid,
Valid Clean or Valid Dirty.


Invalid means the Data Cache entry does not contain valid data. Upon a miss, the cache 
can load data into this cache line with no further actions. 


The Valid Clean state indicates that there are valid data in the Data Cache line and they 
are the same as memory. All writeback segments have their data in the Valid Clean state 
until they are written to by the processor. 


The C790 supports the write-back protocol, hence the need for a Valid Dirty state. A Data 
Cache line transitions to the Valid Dirty state when the cache line is written to without 
reflecting the operation on the bus - the writeback protocol. In this case, the data in the 
cache does not match the data in memory. 


Figure 7-4 shows the transition diagram of the Data Cache performing according to the 
writeback policy. For details on the CACHE operation, refer to Appendix C. 


[Figure 7-4 transition labels, grouped by destination state]

To Invalid:
    CACHE Index Invalidate
    CACHE Index WriteBack Invalidate
    CACHE Hit WriteBack Invalidate (if hit)
    CACHE Hit Invalidate (if hit)
    CACHE Index Store Tag (if V = 0)
    Reset

To Valid Clean:
    Read Miss
    PREF Miss
    CACHE Index Store Tag (if V = 1, D = 0)
    CACHE Hit W/B without Invalidate (if hit)

To Valid Dirty:
    CPU Write
    Write Miss
    CACHE Index Store Tag (if V = 1, D = 1)

Remaining in Valid Dirty:
    Write
    Read


Figure 7-4. Data Cache Transition Diagram, Writeback Protocol
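
The transitions described in this section can be sketched as a next-state function. This is only an illustrative model under stated assumptions: the event names are hypothetical, "read" and "write" mean accesses that hit the line, and the destination of each event is inferred from the state descriptions above and the conditions attached to the CACHE suboperations.

```python
INVALID = "Invalid"
VALID_CLEAN = "Valid Clean"
VALID_DIRTY = "Valid Dirty"


def next_state(state, event, hit=True, v=0, d=0):
    """Return the next Data Cache line state for one event.

    hit  -- whether a Hit-type CACHE suboperation hits this line
    v, d -- Valid and Dirty bits written by CACHE Index Store Tag
    """
    if event in ("reset", "index_invalidate", "index_writeback_invalidate"):
        return INVALID
    if event in ("hit_invalidate", "hit_writeback_invalidate"):
        return INVALID if hit else state
    if event == "index_store_tag":
        if not v:
            return INVALID  # the Dirty bit is forced to 0 when V = 0 (Table 7-3)
        return VALID_DIRTY if d else VALID_CLEAN
    if event == "hit_writeback":  # Hit Writeback without Invalidate
        return VALID_CLEAN if hit and state != INVALID else state
    if event in ("read_miss", "pref_miss"):
        return VALID_CLEAN
    if event == "write_miss":
        return VALID_DIRTY
    if event == "write":
        return VALID_DIRTY if state != INVALID else state
    if event == "read":
        return state
    raise ValueError("unknown event: " + event)
```

A line therefore becomes dirty only through a write, a write miss, or an explicit Index Store Tag with V = 1 and D = 1.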
7.3.6 Instruction Cache State Transitions


Cache lines in the Instruction Cache can be in either of two states: Invalid or Valid.


Invalid means the Instruction Cache entry does not contain valid instruction data. Upon a 
miss, the cache can load instructions into this cache line with no further actions. 


The Valid state indicates that there are valid instructions in the cache line and so there is 
no need for miss processing. 


The transition diagram for the Instruction Cache is simple; refer to Figure 7-5. For 
details on the CACHE instructions refer to Appendix C. 


[Figure 7-5 transition labels, grouped by destination state]

To Invalid:
    CACHE Index Store Tag (if V = 0)
    CACHE Index Invalidate
    CACHE Hit Invalidate (if hit)
    Reset

To Valid:
    CACHE Index Store Tag (if V = 1)
    CPU Read Miss
    CACHE Fill

Remaining in Valid:
    CPU Read


Figure 7-5. Instruction Cache Transition Diagram


7.3.7 Data Cache Lock Function 


In a 2-way set-associative Data Cache, such as the one present in the C790, there is no 
explicit way of forcing data to be retained in the cache. The LRF-based mechanism 
dynamically determines which cache line should be replaced. A Data Cache lock function 
has been defined to aid in retaining critical pieces of data in the Data Cache under strict 
program control. 


Each entry on each way of the Data Cache has a Lock (L) bit. The Lock bit aids in locking 
the line by writing directly into it. After locking the line, the LRF bit is no longer 
meaningful. Thus, if one of the ways for a particular line is locked, the other way is the 
only way available for caching. Thus, once a line is locked with a particular physical 
address tag, any other virtual address which maps onto the same cache line will have only 
a direct mapped location rather than a 2-way location. 


To lock the Data Cache, the following two CACHE instruction suboperations can be used: 
INDEX STORE TAG (DCACHE) 
INDEX STORE DATA (DCACHE) 


For details of the above CACHE instruction suboperations refer to Section 7.6. To lock a
Data Cache line, the following code sequence can be used:

        li      t0,0x00010068   //PTagLo = 0x00010, D=V=L=1, R=0
        mtc0    t0,$28          //t0 -> TagLo
        sync.l
        cache   18,0(r0)        //TagLo -> Tag (way0)
        sync.l
        la      s0,0x00010000
        sw      t1,0(s0)        //store contents of t1 into
                                //locked cache line


In this example, the tag has been modified using the CACHE instruction and the data has 
been updated using a Store instruction. 


The following restrictions apply to line locking:

• The result of relocking a locked line is undefined
• The results of locking both ways of a cache line are undefined


To unlock Data Cache lines, the following code sequence can be used: 


        li      t0,0x00010060   //D=V=1, L=R=0
        mtc0    t0,$28          //t0 -> TagLo
        sync.l
        cache   18,0(r0)        //TagLo -> Tag (way0)
        sync.l


7.3.7.1 Operations During Lock


When the lock bit is set for a cache line (index), only the other way is available for handling
cache misses. The misses are blocking. A write access to a locked line in the Data Cache
takes place only in the cache, without affecting the state of memory. Writes to locked cache
lines will not set the DIRTY (D) bit.


7.3.8 Relationship Between Cached and Uncached Operations 


Uncached and Uncached Accelerated load and store operations are always executed in 
order on the CPU bus. Cached load operations can precede earlier store data present in 
buffers on the CPU bus. All store data present in buffers prevents a SYNC (or SYNC.L) 
instruction from completing until the store data has been sent either to the Data Cache or 
the CPU bus. 


Stores with the uncached and uncached accelerated attributes bypass the Data Cache
completely.


7.4 Uncached Accelerated Buffer


The C790 has a small read-only cache memory for the uncached accelerated area to
reduce bus traffic. This read-only cache, the Uncached Accelerated Buffer (UCAB), is
loaded only by the refill process that follows a load miss on the UCAB. Once load
instructions hit on the UCAB, data are provided directly from the UCAB. The UCAB is
invalidated under the following conditions:


• Any load operation which doesn’t hit the UCAB, or
• Any store operation, or
• A SYNC (or SYNC.L) operation, or
• Any exception


Snoop is not supported for the UCAB. 
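
The UCAB behavior described above can be sketched as a single-line buffer. The sketch assumes the UCAB holds exactly one 128-byte line (its size and line size are both 128 bytes in Table 7-6) and that an uncached accelerated load that misses refills that line; the class and method names are hypothetical.

```python
UCAB_LINE_BYTES = 128


class Ucab:
    """Toy model of the Uncached Accelerated Buffer as one 128-byte line."""

    def __init__(self):
        self.valid = False   # the valid bit is 0 after reset
        self.tag = None

    def load_uncached_accelerated(self, pa):
        """Return True on a hit; on a miss, refill the line and return False."""
        tag = pa // UCAB_LINE_BYTES
        if self.valid and self.tag == tag:
            return True
        self.valid = True
        self.tag = tag
        return False

    def invalidate(self):
        """Model a store, SYNC/SYNC.L, exception, or a load that misses the UCAB."""
        self.valid = False
```

Consecutive loads from the same 128-byte region hit after the first refill, which is how the buffer reduces bus traffic; any of the invalidating events forces the next load to go back to the bus.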


7.4.1 UCAB Configuration


The UCAB is configured as shown in Table 7-6.


Table 7-6. UCAB Configuration


|                             | Size      | Organization | Line Size | Refill Size |
| Uncached Accelerated Buffer | 128 bytes |              | 128 bytes | 128 bytes   |


7.4.2 Tag Structure 


Like the caches, the UCAB is indexed by the virtual address, while the tag comparison is
physical. Table 7-7 shows the UCAB size and access bits.


Table 7-7. UCAB Size and Access Bits


| Way | UCAB Virtual Index Bits | UCAB Tag Size | UCAB Tag Virtual Index Bits |


The least significant 5 bits of the UCAB Tag ([11:7]) are identical to virtual address bits
[11:7]. The UCAB Tag has a single valid bit. The UCAB Tag does not have Dirty, LRF, or
Lock bits. The valid bit of the UCAB Tag is initialized to 0 upon reset.


7.4.3 Non-blocking Loads and Hit Under Miss


Like the Data Cache, the UCAB supports non-blocking load and hit under miss.
Non-blocking load and Hit under miss allow the pipeline to continue instruction execution
until one of the following occurs when an Uncached Accelerated Buffer miss occurs:


1. A subsequent instruction has data dependency with the load that is pending (to
be retired).


2. A Data Cache miss occurs or a miss occurs on the UCAB.
3. An uncached load instruction is issued.
4. A pipeline stalls.


7.5 Cache Control Registers


The operations of the caches are controlled by certain programmable bits in the Config 
register. These bits are: 


ICE    Instruction Cache Enable
DCE    Data Cache Enable
IC     Instruction Cache Size
DC     Data Cache Size
IB     Icache Line Size
DB     Dcache Line Size


For details of these configuration bits refer to the COP0 register section.


The two cache tag registers TagLo and TagHi are 32-bit read/write registers that hold the
tag and state of the cache line during initialization and diagnostics. The Tag registers are
manipulated by MTC0 and CACHE instructions.


TagLo

  31            12 11     7   6   5   4   3   2     0
  |    PTagLo     |   0    | D | V | R | L |    0    |

PTagLo   Specifies physical address bits 31:12
D        Cache State DIRTY bit (Not used for the Instruction Cache)
V        Cache State VALID bit
R        LRF Bit
L        LOCK Bit (Not used for the Instruction Cache)
0        Must be written as zeros, will return zero on reads
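
As a cross-check of the TagLo layout, the following sketch packs and unpacks the fields, assuming D, V, R and L occupy bits 6, 5, 4 and 3, which is consistent with the values 0x00010068 and 0x00010060 used in the lock and unlock sequences of Section 7.3.7. The function names are hypothetical.

```python
def encode_taglo(pfn, d=0, v=0, r=0, l=0):
    """Pack a TagLo value: PTagLo in bits 31:12, D/V/R/L in bits 6/5/4/3."""
    if not 0 <= pfn < (1 << 20):
        raise ValueError("PTagLo must fit in 20 bits")
    for bit in (d, v, r, l):
        if bit not in (0, 1):
            raise ValueError("state bits must be 0 or 1")
    return (pfn << 12) | (d << 6) | (v << 5) | (r << 4) | (l << 3)


def decode_taglo(value):
    """Unpack a TagLo value into its fields; reserved bits are ignored."""
    return {
        "pfn": (value >> 12) & 0xFFFFF,
        "d": (value >> 6) & 1,
        "v": (value >> 5) & 1,
        "r": (value >> 4) & 1,
        "l": (value >> 3) & 1,
    }
```

For example, `encode_taglo(0x00010, d=1, v=1, l=1)` reproduces the constant loaded by the lock sequence, and clearing `l` gives the constant used to unlock the line.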


The TagHi register contains instruction- and operation-specific items (see the next
section).


7.6 CACHE Instruction


For information on the CACHE instruction, please refer to Appendix C.


8. CPU Bus


The C790 CPU core is connected to the rest of the system¹, and to external devices,
through the group of on-chip C790 system bus signals called the CPU Bus. This chapter
defines the architecture of the CPU Bus and describes it in the context of an overall
system design.


This chapter describes the following: 


• the CPU Bus architecture and agents on the CPU Bus
• the types of transactions possible between agents on the bus
• the bus protocols for transactions


¹ The system consists of a DMA Controller (DMAC) as a master, and various slave devices.


8.1 Introduction


The CPU Bus is an on-chip bus in a highly integrated processor. All agents (see definitions
in Section 8.1.1 below) on the CPU Bus are equipped with a CPU Bus interface unit connected
via CPU Bus signals. An agent acts like a master when it initiates reads or writes on
the bus. An agent acts like a slave when it responds to reads or writes initiated by a master.
For the CPU Bus to operate properly, an arbiter is needed to perform arbitration between
the CPU and the other bus masters. The arbiter is located in the CPU, and CPU
arbitration behavior is discussed in Section 8.5.1, Arbitration Operations.


The following are main features of the CPU Bus: 


• Separate data and address buses (demultiplexed operation)
• 128-bit data bus
• Clocked synchronous operations
• Peak transfer rate of 2.1 GB/sec (@133 MHz bus clock)
• 8/16/32/64/128-bit and burst accesses
• Multimaster capability
• Pipelined operations
• No turn-around or dead cycles between transfers
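
The peak transfer rate in the feature list follows directly from the bus width and the bus clock. The short calculation below is only a worked check of that figure, assuming one 128-bit transfer per bus clock at 133 MHz; the function name is hypothetical.

```python
def peak_bytes_per_second(bus_width_bits=128, bus_clock_hz=133_000_000):
    """Peak rate assuming one full-width transfer on every bus clock."""
    return (bus_width_bits // 8) * bus_clock_hz


rate = peak_bytes_per_second()   # 16 bytes per clock at 133 MHz
rate_gb = round(rate / 1e9, 1)   # expressed in GB/sec, one decimal place
```

Sixteen bytes per clock at 133 MHz gives 2,128,000,000 bytes per second, which rounds to the quoted 2.1 GB/sec.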


The CPU Bus does not provide: 


• Cache coherency support
• Split transactions


8.1.1 Terminology


Address Phase is the cycles during which an address is driven on the CPU Bus through 
the cycle the address is acknowledged. 


Agent refers to different devices on the CPU Bus. 


Assert means taking a signal to its active level. An active high signal is “1” when asserted,
and an active low signal is “0” when asserted.


CPU means the C790 CPU. The terms CPU and C790 are used interchangeably in this
chapter.


Data Phase is the cycles during which data are driven on the bus through the cycle they 
are acknowledged. 


DMAC is the DMA Controller in the system. 
Master means the current bus master on the CPU Bus. 
MEM refers to the system memory controller. 


Negate/Deassert means taking a signal to its inactive state. An active high signal is “0”
when deasserted. An active low signal is “1” when negated.


* (after signal name) means active low signal. 


8.1.2 Signal Naming Convention 


Table 8-1 shows the prefixes used for naming signals in a system incorporating the C790 
CPU Bus. 


Table 8-1. System Signal Naming Convention


| Signal Prefix | Signal Type |
| CPU | Signals from the CPU multiplexed or logically combined with the DMAC signals to form the system signals. These signals include: CPUADDR, CPUBE*, CPURD*, CPUWR*, CPUTSIZE, CPUASTART*, CPUDSTART*, CPUDATA. |
| SYS | The combined or multiplexed signals from any agents on the CPU Bus. These signals include: SYSADDR, SYSBE*, SYSRD*, SYSWR*, SYSTSIZE, SYSASTART*, SYSDSTART*, SYSAACK*, SYSDACK*, SYSDATA. |


8.2 CPU Bus Architecture


The CPU Bus design is a synchronous pipelined bus with separate data (128-bit) and 
address buses running at half the clock frequency of the CPU. The CPU is connected to 
the rest of the system and external devices through this bus. Figure 8-1 illustrates the 
architecture of the bus and identifies different agents that can be on the bus. 


[Figure 8-1 labels: CPU; Memory Controller; I/O Devices]


Figure 8-1. CPU Bus Architecture


8.2.1 CPU Bus Connectivity for Address and Control Paths


Figure 8-2 illustrates the system-level interconnections for address paths of the CPU Bus.


Support logic is needed to handle the fact that the system contains multiple masters. 
AGNT* is used to control the multiplexer in the support logic that selects a master to be 
connected to the CPU Bus. 


[Figure 8-2 labels: CPUASTART*, DMAASTART*, SYSASTART*; CPUADDR, CPUBE*, CPURD*, CPUWR*; DMAADDR, DMATSIZE, DMARD*, DMAWR*; Memory Controller; I/O Devices; IOAACK*]


Figure 8-2. CPU Bus Address and Control Path Connections in System


8.2.2 CPU Bus Connectivity for Data Paths


Figure 8-3 illustrates the system-level interconnections for data paths of the CPU Bus.


For read cycles, the support logic must control the multiplexer so that the correct source of 
data is put on SYSDATA. 


For write cycles, the support logic must detect whether the cycle is a CPU cycle or a DMA
cycle, and use this to control the multiplexer.


[Figure 8-3 labels: CPUDSTART*, DMADSTART*, SYSDSTART*; Memory Controller; DMAC; I/O Devices; SYSDACK*; IODACK*]


Figure 8-3. CPU Bus Data Path Connections in System


8.3 CPU Bus Signal Descriptions


This section describes the CPU Bus signals and their usage in different bus operations. 


8.3.1 Address Bus Signals 


CPUADDR[31:4] CPU address bus 


CPUADDR[31:4] bits are valid during the address phase and can be sampled by the slave 
when CPUASTART* is sampled low. 


SYSADDR[31:4] System address bus


SYSADDR[31:4] are multiplexed outputs selecting between CPUADDR[31:4] and DMA 
address. They are valid during the address phase and can be sampled by the slave when 
SYSASTART* is sampled low. 


CPUBE[15:0]* CPU byte enables 


CPUBE[i]*, driven during the address phase, indicates valid data on byte i of 
CPUDATA[127:0] during the data phase. CPU byte enables can be sampled by the slave 
when CPUASTART* is sampled low. CPU byte enables are used only in CPU single cycles. 
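
To make the byte-enable convention concrete, the sketch below computes an active-low CPUBE[15:0]* pattern for a single access. It assumes that byte i of the quadword corresponds to address offset i (little-endian byte lanes) and that accesses are naturally aligned; neither assumption is stated in this section, and the function name is hypothetical.

```python
def byte_enables(addr, size_bytes):
    """Return the 16-bit active-low byte-enable pattern for one access.

    A 0 bit in position i means byte i of the 128-bit data bus is valid.
    Assumes little-endian byte lanes and naturally aligned accesses.
    """
    if size_bytes not in (1, 2, 4, 8, 16):
        raise ValueError("unsupported access size")
    offset = addr & 0xF                      # address bits 3:0 select the lanes
    if offset % size_bytes:
        raise ValueError("access is not naturally aligned")
    active = ((1 << size_bytes) - 1) << offset   # 1 = lane carries valid data
    return ~active & 0xFFFF                      # invert: enables are active low


def quadword_address(addr):
    """Value driven on CPUADDR[31:4] for a byte address."""
    return addr >> 4
```

A word access at offset 4, for example, drives bits 7:4 low and leaves the other twelve enables high.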


SYSBE[15:0]* System byte enables 


SYSBE[i]*, driven during the address phase, indicates valid data on byte i of 
SYSDATA[127:0] during the data phase. System byte enables can be sampled by the slave 
when SYSASTART* is sampled low. System byte enables are used only in CPU single
cycles.


CPUTRANSTYPE[4:0] CPU transaction type


CPUTRANSTYPE[4:0], driven during the address phase, indicates the type of operation. 
CPU transaction type can be sampled by the slave when CPUASTART* is sampled low. 


Table 8-2. Bus Transaction Types 


CPURD* CPU read 


The CPU asserts this signal to indicate a read operation. This signal can be sampled when 
CPUASTART* is sampled low. This signal is active during the address phase. CPURD* is 
used in transfers initiated by the CPU. 


CPUWR* CPU write 


The CPU asserts this signal to indicate a write operation. This signal can be sampled 
when CPUASTART* is sampled low. This signal is active during the address phase. 
CPUWR* is used in transfers initiated by the CPU.


CPUTSIZE[1:0] CPU transfer size


While driven by the CPU, these signals indicate the size of the transfer in the current 
CPU initiated bus cycle. They are driven during the address phase and can be sampled 
starting at the edge where CPUASTART* is sampled low. 


Table 8-3. CPU Transfer Size 


CPUTSIZE[1:0]
                 1 Quadword (Single Cycle)


SYSTSIZE[2:0] System transfer size


While driven by the system, these signals indicate the size of the transfer in the current 
system bus cycle. They are driven during the address phase and can be sampled starting 
at the edge where SYSASTART* is sampled low. 


CPUASTART* CPU address start 


Driven by the CPU, it indicates the start of the address phase. Address, byte enable, and
control signals (CPUADDR[31:4], CPUBE[15:0]*, CPURD*, CPUWR*, and CPUTSIZE)
can be sampled to determine the type of cycle requested starting where CPUASTART* is
sampled low. CPUASTART* is driven active for only one cycle.


SYSASTART* System address start 


SYSASTART* is driven by the system; it indicates the start of the address phase. Address,
byte enable, and control signals can be sampled to determine the type of cycle requested
starting where SYSASTART* is sampled low. SYSASTART* is driven active for only one
cycle.


SYSAACK* System address acknowledge


This signal is an input to all the agents on the CPU Bus indicating that address and control
signals have been sampled by the slave. The master terminates the address phase one
cycle after sampling SYSAACK* low.


CPUDATA[127:0] CPU data bus 
This is a 128-bit data bus output from the CPU. 
SYSDATA[127:0] System data bus 


This is the 128-bit data bus input to all devices on the CPU Bus.


CPUDSTART* CPU data start


During read/write operations, this output from the CPU indicates the start of data phase.
For CPU write operations, the slave can sample data from the bus one cycle after
CPUDSTART* has been asserted. For CPU read operations, the slave can output data on the bus
any cycle after the cycle CPUDSTART* has been asserted.


SYSDSTART* System data start 


During read/write operations, this output from the system indicates the start of data
phase. Data transfer can begin one cycle after SYSDSTART* has been asserted. For DMA
cycles, if the slave providing the data cannot supply data in the next cycle after the
assertion of SYSDSTART*, it is the responsibility of the designer to come up with a new
DMA protocol.


SYSDACK* System data acknowledge 


This signal is an input to all the agents on the bus indicating the valid status of data on
the bus. During read cycles, it indicates read data are available on the bus to be sampled
by the master. During write cycles, it indicates the slave has sampled the data. This signal
should be asserted for each data transfer during burst operations. During read transactions,
data are sampled one cycle after SYSDACK* has been asserted. During write
transactions, the master drives new data on the bus one cycle after detecting SYSDACK*
low.


BUSERR* Bus error 


This signal is an input to the CPU and the DMAC which indicates that a bus error has
occurred during the transaction. BUSERR* serves to terminate the bus protocol and return
bus ownership to the CPU.


INT[1:0]* Interrupt request lines


These signals are interrupt inputs to the CPU.


SIOINT* Serial I/O interrupt request


This line provides the serial I/O interrupt from the I/O controller.


NMI* Non-maskable interrupt


Non-maskable interrupt input to the CPU.

SYSBIGENDIAN Big Endian enable 


This input signal is sampled during cold reset and makes the CPU operate as big endian
when it is asserted. The input level of this signal must not be changed during operation.


CPCOND0 Coprocessor conditions


These lines are an input to the CPU as test conditions for some of the branch instructions.


RESET* Reset


Input to the CPU. When this line is asserted, the CPU, DMAC and slave devices execute a 
reset. 


CPUCLK CPU clock 
CPU clock 
BUSCLK Bus clock 


Bus clock: 1/2, 1/3 or 1/4 frequency of the CPUCLK. 
AREQ* Address bus request 


This signal is an output from the DMAC to the CPU. When it is asserted, the DMAC
requests address bus mastership.


AGNT* Address bus grant 


This signal is an output from the CPU to grant the bus mastership to the DMAC. This 
signal is asserted in response to assertion of the AREQ* signal. 


REL* Bus release request 


This signal is asserted by the CPU to request that the current bus owner release the CPU
Bus.


8.4 Overview of CPU Bus Operations


This section discusses CPU Bus operations; it covers processor requests, DMA operations, 
and bus error operation. 


In this section descriptions show CPU signals followed by the system lines, in parentheses, 
onto which they are asserted. For example: CPUASTART* (SYSASTART*) means 
CPUASTART* is asserted on the SYSASTART* line. Where a value is given, the bits 
output by the CPU are shown, followed by the bits, in parentheses, on the system lines. 
For example if we have 11 on CPUTSIZE[1:0], during a CPU bus cycle, then we will get 
011 on the SYSTSIZE[2:0]. This will be shown as 11 (011). 


8.4.1 CPU Bus Operations 


The CPU Bus is different from conventional buses in that it allows pipeline operations. In 
this case, pipeline implies up to two outstanding requests before any data transaction has 
taken place. For instance, the CPU may issue two back-to-back read requests to main 
memory before any data have been returned. Note that at any time, there can only be two 
outstanding requests on the bus. The master requiring more than two operations has to 
wait until the first request has been serviced completely prior to issuing the third one. 
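
The two-request limit described above can be pictured with a small software model. This is only an illustrative sketch: the class and method names are hypothetical, and the model tracks nothing but the count of outstanding requests.

```python
from collections import deque

MAX_OUTSTANDING = 2  # from the text: at most two outstanding requests on the bus


class BusPipeline:
    """Toy model of the two-deep request pipeline (names are illustrative)."""

    def __init__(self):
        self.outstanding = deque()

    def issue(self, address):
        """Try to start an address phase; return False if the master must wait."""
        if len(self.outstanding) >= MAX_OUTSTANDING:
            return False
        self.outstanding.append(address)
        return True

    def complete_oldest(self):
        """Mark the oldest request as completely serviced and return its address."""
        return self.outstanding.popleft()


bus = BusPipeline()
bus.issue(0x1000)          # accepted
bus.issue(0x2000)          # accepted, two requests now outstanding
third = bus.issue(0x3000)  # rejected until the first request completes
```

A master holding a third request would retry `issue` after `complete_oldest` has been called for the first one.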


8.4.2 Processor Requests 


The CPU issues single requests, burst requests or a series of requests to other agents on
the bus. These requests are referred to as processor requests initiated through the CPU
Bus interface.


The processor requests are in response to the following system events: 

Load miss 

Store miss 

Write-back buffer writes (dirty data cache lines, uncached writes, etc.) 
Uncached loads and uncached accelerated loads 

Instruction miss and uncached instruction fetch 


Processor read/write requests can be a burst, quadword, or partial quadword of data to 
and from the main memory or any other system resources. A processor-initiated burst is 
always 4 quadwords. 


8.4.2.1 Read Requests 


The CPU initiates read requests by driving address and control on the bus and asserting 
CPUASTART* (SYSASTART*) to indicate valid address and control. The CPU will keep 
driving address and control until the slave device has acknowledged the address phase by 
asserting address acknowledge, SYSAACK*. For burst reads, the CPU drives CPUTSIZE 
(SYSTSIZE) to 11 (011) to indicate burst reads. The CPU also indicates that it is ready to 
accept read data by asserting CPUDSTART* (SYSDSTART*). The slave device returns the 
requested data on the data bus by asserting SYSDACK*, data acknowledge. 
8.4.2.2 Write Requests


The CPU initiates write requests by driving address and control on the bus and asserting 
CPUASTART* (SYSASTART*). The CPU also drives data on the bus and indicates that by 
asserting CPUDSTART* (SYSDSTART*). The slave device accepts the address and data 
by asserting SYSAACK* and SYSDACK*, respectively. Burst writes are indicated by 
driving CPUTSIZE (SYSTSIZE) to 11 (011) during the address phase. 


8.4.3 Bus Error Operations 


A bus error occurs when the CPU or the DMAC initiates a cycle but no device on the
CPU Bus responds to it. The absence of a response to either the address phase or
the data phase causes the bus error condition. The bus error is always imprecise.


When a bus error occurs, all agents on the CPU Bus, including the CPU, the DMAC, and
slave devices, terminate the current bus cycle.


When the CPU is the initiator of the cycle, there can be two types of bus error:


• Data load/store bus error
• Instruction fetch bus error


Bus error sets the corresponding exception bit in the CAUSE register. Subsequently, the 
CPU will jump to the proper error handler for the examination of the exception. However, 
the bus error exception is imprecise. There is no guarantee that the CPU can recover from 
this error condition. 


When the DMAC is the initiator of the cycle, the types of bus error depend on the
implementation of the DMAC. After a bus error occurs, the DMAC releases bus mastership
back to the CPU and asserts an interrupt or NMI to the CPU. The interrupt or NMI
routine then handles the bus error condition for the DMAC.
8.5 CPU Bus Transaction Protocols and Timing


This section describes transaction protocols and the timing for the following CPU Bus op- 
erations: 


Arbitration 

CPU single operations (one quadword) 

CPU burst operations (four quadwords) 

CPU non-pipelined single operations (one quadword) 
CPU non-pipelined burst operations (four quadwords) 
Bus error operations 


8.5.1 Arbitration Operations 


An arbiter is required to mediate between devices requesting the CPU Bus. The arbiter is 
located in the CPU. The CPU is the default bus master; AREQ* and AGNT* are both 
deasserted during RESET. 


A master other than the CPU may request the bus by asserting the request signal, AREQ*.
In response to the AREQ* signal, the CPU issues the grant signal, AGNT*, to grant
the address bus to the requesting master. In the cycle in which AGNT* is sampled active by
the bus master, the master starts the address phases, and it deasserts AREQ* at the
beginning of the last address phase. When the corresponding data phases commence, the CPU
or the requesting master starts the data transfers, depending on the DMA transfer. Data
phases follow the exact order of address phases. The arbitration signals are shown in Figure 8-4.


[Block diagram. Labels: AGNT*, REL*, Bus Master, CPU Bus]


Figure 8-4. Connection of Arbitration Signals


On the CPU Bus, the DMAC always has higher arbitration priority than the CPU. When both
the CPU and the DMAC arbitrate for the CPU Bus, the arbiter grants bus mastership to the
DMAC. The CPU can assert REL* to the DMAC in an effort to get bus ownership back from
the DMAC. The CPU proceeds with the transfer once the DMAC has released the CPU Bus.
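
The priority rule can be expressed as a short decision function. This is a minimal sketch with hypothetical names; the active-low signals are modeled as booleans meaning "asserted", and timing is ignored entirely.

```python
def arbitrate(cpu_wants_bus, dmac_requests):
    """Return (owner, agnt_asserted, rel_asserted) for one arbitration decision.

    Both arguments are booleans; dmac_requests stands for AREQ* being asserted.
    """
    if dmac_requests:
        # The DMAC always wins. If the CPU also needs the bus, it can only
        # ask for it back by asserting REL*.
        return ("DMAC", True, cpu_wants_bus)
    # With no DMAC request, the CPU stays the default bus master.
    return ("CPU", False, False)
```

For example, `arbitrate(True, True)` returns `("DMAC", True, True)`: the DMAC is granted the bus and the CPU signals REL*.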


The arbitration cycles and protocol are shown in Figure 8-5. In response to the DMAC asserting its
request AREQ*, the arbiter asserts AGNT* in cycle 3, which is the arbitration cycle. The DMAC
samples AGNT* asserted and begins its address phases. When the DMAC begins the last
address phase, it deasserts its request line AREQ* in cycle 4. The arbiter then waits for the
SYSAACK* cycle to deassert AGNT* and release bus mastership back to the CPU.
[Timing diagram. Signals: BUSCLK, AREQ*, AGNT*, SYSADDR, SYSASTART*, SYSAACK*]


Figure 8-5. Arbitration Protocol

8.5.1.1 Cycle Stealing 


Cycle stealing refers to the CPU's ability to preempt a master in order to perform a bus 
operation. This operation could be either due to the write back buffer (WBB) being almost 
full (having more than 64 bytes filled up) or the CPU needing to perform an instruction or 
data read. These operations are collectively referred to as cycle stealing operations. 
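
The two triggers for cycle stealing can be written as a predicate. The sketch below is illustrative only: the function name and arguments are hypothetical, and the 64-byte threshold is taken from the paragraph above.

```python
WBB_ALMOST_FULL_BYTES = 64  # from the text: "more than 64 bytes filled up"


def cpu_wants_cycle_steal(wbb_bytes_filled, read_pending):
    """True when the CPU has a reason to preempt the current bus master.

    read_pending stands for an instruction or data read the CPU must perform.
    """
    return wbb_bytes_filled > WBB_ALMOST_FULL_BYTES or read_pending
```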


Figure 8-6 illustrates the cycle stealing protocol. The arbiter asserts the REL* (Release)
signal in response to the CPU's request cycles. The master deasserts its request after
having finished its operations. When the master has begun the last address phase,
it deasserts the AREQ* signal, indicating to the arbiter that the bus will be
relinquished, as shown in cycle 9. When the address phase ends, the address bus is returned
to the CPU by the deassertion of AGNT* in cycle 12. The arbiter deasserts REL* at the
same time AGNT* is deasserted. The data phases follow the same order as the address
phases.


[Timing diagram. Signals: BUSCLK, AREQ*, AGNT*, SYSADDR, SYSASTART*, SYSAACK*, REL*]


Figure 8-6. Cycle Stealing Protocol
8.5.2 CPU Single Operations
CPU Single operations transfer one quadword. 


In single operations, the CPU drives the address, byte enables, and the read/write signals 
and indicates their valid status by asserting CPUASTART* (SYSASTART*). The slave 
samples valid address and control lines and responds by asserting SYSAACK*. In single 
operations, CPUTSIZE (SYSTSIZE) is always 00 (000). 


When the CPU detects SYSAACK* active and is ready to put another address on the bus, 
it will start another address phase. The bus only supports two levels of address pipelining. 
That means only two address phases can be outstanding before any data phase begins. 


The CPU indicates that it is ready to accept/supply data by asserting CPUDSTART*
(SYSDSTART*) one cycle prior to actually accepting/supplying it. For read cycles, the
slave supplies the data and indicates that the data is ready by asserting SYSDACK*. For
write cycles, the CPU supplies data one cycle after CPUDSTART* (SYSDSTART*) is
asserted, and the slave accepts the data by asserting SYSDACK*.


8.5.2.1 CPU Single Reads 


The fastest CPU single read is 2 cycles. Address and data phases for AddrA illustrate the
fastest CPU single read cycle. The CPU asserts CPUASTART* (SYSASTART*) to begin
the address phase in cycle 1. The slave device asserts SYSAACK* in cycle 1 to indicate
that it has sampled the address. The CPU then begins another address phase in cycle 3.
The assertion of SYSDACK* by the slave device in cycle 1 triggers the CPU to sample
SYSDATA at the end of cycle 2.
Figure 8-7. CPU Single Reads

8.5.2.2 CPU Single Writes


The fastest CPU single write is 2 cycles. Address and data phases for AddrA illustrate the
fastest CPU single write cycle. The CPU always drives data onto CPUDATA one cycle
after the assertion of CPUDSTART* (SYSDSTART*). For example, the CPU drives
CPUDATA in cycle 2, which is one cycle after the assertion of CPUDSTART*
(SYSDSTART*) in cycle 1. The slave device samples SYSDATA one cycle after the
assertion of SYSDACK*.
Figure 8-8. CPU Single Writes

8.5.2.3 CPU Single Read-Write-Read-Write Cycles


All adjacent address phases are read-write or write-read cycles. AddrA is a read address 
and AddrB is a write address, and so on. 
Figure 8-9. CPU Single Read-Write-Read-Write Cycles

8.5.3 CPU Burst Operations


CPU Burst operations transfer four quadwords. In burst operations, the CPU drives the
address and control signals and indicates their validity by asserting CPUASTART*
(SYSASTART*). The slave samples valid address and control lines and asserts SYSAACK*
to acknowledge the address phase. The address phase spans the cycles from the assertion
of CPUASTART* (SYSASTART*) to one cycle after SYSAACK* is asserted.


When the CPU detects SYSAACK* active and has another address ready, it will start
another address phase.


The CPU indicates that it is ready to accept/supply data by asserting CPUDSTART*
(SYSDSTART*) one cycle prior to actually accepting/supplying it. For read cycles, the
slave supplies the data and indicates that data are valid by asserting SYSDACK* one
cycle prior to the data being available. For write cycles, the CPU supplies data one cycle
after CPUDSTART* (SYSDSTART*) is asserted, and the slave accepts the data by asserting
SYSDACK*. For burst cycles, SYSDACK* is asserted multiple times, once per data transfer.


The CPUTSIZE (SYSTSIZE) indicates the number of quadwords in the transfer. The CPU 
initiated cycles use only values of either 00 (for CPU Single operations) or 11 (for CPU 
Burst operations), which are single and burst of 4 quadwords respectively. 
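
The two CPUTSIZE encodings used by the CPU, and the way a CPU value is written next to its system-line value, can be sketched as follows. This is an illustration under stated assumptions: a quadword is taken to be 16 bytes, SYSTSIZE is assumed to be CPUTSIZE zero-extended by one bit (inferred only from the 11 (011) and 00 (000) examples in this chapter), and all names are hypothetical.

```python
QUADWORD_BYTES = 16  # assumption: one quadword is 128 bits

# Only the two encodings the text says CPU-initiated cycles use.
TSIZE_QUADWORDS = {0b00: 1, 0b11: 4}


def systsize_bits(cputsize):
    """Zero-extend CPUTSIZE[1:0] to a 3-bit SYSTSIZE[2:0] string (assumed mapping)."""
    return format(cputsize & 0b11, "03b")


def notation(cputsize):
    """Render a value the way this chapter does, e.g. '11 (011)'."""
    return f"{cputsize:02b} ({systsize_bits(cputsize)})"


def transfer_bytes(cputsize):
    """Number of bytes moved by a CPU-initiated transfer of this size."""
    return TSIZE_QUADWORDS[cputsize] * QUADWORD_BYTES
```

With these definitions, `notation(0b11)` gives `"11 (011)"` and `transfer_bytes(0b11)` gives 64.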


8.5.3.1 CPU Burst Reads


The fastest CPU burst read is 5 cycles. Address and data phases for AddrA illustrate the 
fastest CPU burst read cycle. There are four SYSDACK* sent by the slave device for every 
CPU burst read cycle. The slave device asserts SYSDACK* in cycle 1, 2, 3, and 4 to indi- 
cate that data can be sampled at the end of cycle 2, 3, 4, and 5 by the CPU. 
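
The relationship between SYSDACK* and data sampling in the fastest burst read is a one-cycle offset, sketched below. The function name is hypothetical and the sketch only restates the cycle numbers given above.

```python
def sample_cycles(dack_cycles):
    """Cycles at whose end the CPU samples data, given the SYSDACK* cycles.

    Each SYSDACK* assertion announces data that is sampled at the end of the
    following cycle.
    """
    return [cycle + 1 for cycle in dack_cycles]


fastest_burst = sample_cycles([1, 2, 3, 4])  # [2, 3, 4, 5]
```

The last sample falls at the end of cycle 5, which matches the stated minimum of 5 cycles for a burst read.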
Figure 8-10. CPU Burst Reads

8.5.3.2 CPU Burst Writes


The fastest CPU burst write is 5 cycles. Address and data phases for AddrA illustrate the
fastest CPU burst write cycle. After assertion of CPUDSTART* (SYSDSTART*) in cycle 1,
the CPU drives the first data on CPUDATA in cycle 2. As SYSDACK* is sampled asserted
in cycles 1, 2, 3, and 4, the CPU drives new data on CPUDATA at the end of cycles 2, 3,
4, and 5.
Figure 8-11. CPU Burst Writes

8.5.3.3 CPU Burst Read-Write Cycles


All adjacent address phases are read-write or write-read cycles. AddrA is a read address 
and AddrB is a write address, and so on. 
Figure 8-12. CPU Burst Read-Write Cycles


8.5.3.4 CPU Burst Write-Read Cycles 


All adjacent address phases are read-write or write-read cycles. AddrA is a write address 
and AddrB is a read address, and so on. 
Figure 8-13. CPU Burst Write-Read Cycles

8.5.4 CPU Non-Pipeline Single Operations


The CPU Bus can support non-pipeline operations as well as pipeline operations. The 
non-pipeline operations are done simply by delaying the assertion of SYSAACK* until the 
last SYSDACK* of the bus transaction. The advantage of this is that the peripheral does 
not need to save the current address; it just decodes the address on the address bus for the 
current operation. Using this mode of operation simplifies the peripheral interfaces to the 
CPU Bus but it degrades the system performance. 

8.5.4.1 CPU Non-Pipeline Single Reads 


All adjacent address phases are read cycles. 
Figure 8-14. CPU Non-Pipeline Single Reads

8.5.4.2 CPU Non-Pipeline Single Writes


All adjacent address phases are write cycles. 
Figure 8-15. CPU Non-Pipeline Single Writes


8.5.5 CPU Non-Pipeline Burst Operations


8.5.5.1 CPU Non-Pipeline Burst Reads


All adjacent address phases are read cycles.

Figure 8-16. CPU Non-Pipeline Burst Reads

8.5.5.2 CPU Non-Pipeline Burst Writes


All adjacent address phases are write cycles. 
Figure 8-17. CPU Non-Pipeline Burst Writes

8.5.6 Bus Error Operations


A bus error occurs when no slave responds to the address or data phase of the
bus cycle. When a bus error occurs, the current bus operation is terminated, and the system
proceeds with the next bus operation. Without bus error detection, the CPU Bus would
wait indefinitely for the SYSAACK* or SYSDACK* signal.


A bus error is generated by the CPU Bus monitor logic. The monitor logic checks
that the address and data phases of the current CPU Bus cycle receive
SYSAACK* and SYSDACK*, respectively. If the address or data phase of the current
CPU Bus cycle receives no SYSAACK* or SYSDACK* response within a pre-defined period
of time, the monitor logic generates a bus error by asserting BUSERR* for one CPU
Bus clock. A bus error has higher priority than SYSAACK* or SYSDACK* if they are
detected in the same cycle.
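
The monitor behavior can be modeled as a simple watchdog. This is a sketch only: the manual does not give the length of the pre-defined period, so `timeout` is a hypothetical parameter counted in bus clocks, and the function name is illustrative.

```python
def buserr_cycle(ack_trace, timeout):
    """Return the cycle index in which BUSERR* would be asserted, or None.

    ack_trace holds one boolean per bus clock, True when the expected
    SYSAACK* or SYSDACK* is seen in that cycle. timeout is the assumed
    number of clocks the monitor waits before declaring a bus error.
    """
    for cycle, acked in enumerate(ack_trace):
        if cycle >= timeout:
            # BUSERR* has priority even if an acknowledge arrives in this cycle.
            return cycle
        if acked:
            return None
    return None
```

For instance, with `timeout=3` an acknowledge that first appears in cycle 3 is too late, and the function reports a bus error in that cycle.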


Bus error is always asserted in reference to the data phase of the cycle. The exact timing 
is the cycles from SYSDSTART* asserted to the cycle before the assertion of the next 
SYSDSTART*. The bus error signal is sampled when the system is waiting for the asser- 
tion of SYSDACK* and/or SYSAACK* of the operation corresponding to the current data 
phase. For example, if the address phase of a certain cycle has no response from the slave 
devices, the bus monitor logic will wait until the SYSDSTART* of the corresponding data 
phase before generating the bus error. The bus monitor logic can generate the bus error 
any time before the next data phase begins. 


8.5.6.1 Bus Error Exceptions


As mentioned before, two operations can be pipelined on the CPU bus, and these two op- 
erations can be initiated from either the CPU as master or the DMAC as master. 


If the bus error occurs in a CPU-initiated operation, the following occurs:


• a bus error exception due to instruction fetch or data access is generated

• the bus error instruction or data address is recorded in the BadPAddr Register
of COP0

• the Status.BEM bit is set (this bit is the bus error mask (BEM) in the COP0
Status Register).


Once a bus error occurs, any further bus errors are ignored until Status.BEM is cleared by 
the bus error exception handler. 


If the bus error occurs in the DMA initiated operation (DMA cycle), the DMAC will finish 
the pending pipeline operations, disable itself, release the CPU Bus, and cause an inter- 
rupt. The interrupt routine will then service and reenable the DMAC accordingly. Table 
8-4 summarizes the exception generation: 


Table 8-4. Bus Error Exceptions


Operation with the Bus Error | Exception Generated
CPU Initiated Instruction Fetch | Bus Error Exception - Instruction Fetch
CPU Initiated Data Access | Bus Error Exception - Data Access
DMA Cycle | Interrupt Exception
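
The rules in this section, including the masking effect of Status.BEM, can be summarized in a small model. This is a sketch, not a description of the hardware: the class and function names are hypothetical, and it assumes that Status.BEM does not mask the interrupt raised for a DMA cycle, which the text does not state either way.

```python
class Cop0State:
    """Holds only the two COP0 items this section mentions (illustrative)."""

    def __init__(self):
        self.status_bem = False  # Status.BEM, the bus error mask
        self.bad_paddr = None    # BadPAddr register contents


def report_bus_error(cop0, initiator, address, instruction_fetch=False):
    """Return the exception named in Table 8-4, or None if the error is ignored."""
    if initiator == "DMA":
        # Assumption: the DMAC interrupt path is independent of Status.BEM.
        return "Interrupt Exception"
    if cop0.status_bem:
        # Further bus errors are ignored until the handler clears Status.BEM.
        return None
    cop0.bad_paddr = address
    cop0.status_bem = True
    if instruction_fetch:
        return "Bus Error Exception - Instruction Fetch"
    return "Bus Error Exception - Data Access"
```

A second CPU bus error reported before the handler clears `status_bem` returns `None` and leaves `bad_paddr` unchanged.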
8.5.6.2 CPU Bus Cycle Termination


Two pipeline operations can be in progress at any time, but if a bus error occurs, only the
operation with the bus error is terminated. That is, the occurrence of a bus error with one
master does not affect the program execution of another master. For example, if a bus error
occurs when the first and second operations are initiated from the DMAC and CPU,
respectively, the CPU Bus will terminate the DMA operation and continue with the CPU
operation. Table 8-5 summarizes the CPU Bus cycle sequence for all types of CPU Bus cycle
termination.


Table 8-5. Operation Termination Sequence


Operation with Bus Error | Second Operation | Termination Sequence
CPU Cycle #1 | CPU Cycle #2 | 1. CPU Cycle #1 is terminated.
 | | 2. Bus Error Exception occurs.
 | | 3. CPU Cycle #2 continues on.
CPU Cycle #1 | DMA Cycle #2 | 1. CPU Cycle #1 is terminated.
 | | 2. Bus Error Exception occurs.
 | | 3. DMA Cycle #2 continues on.
DMA Cycle #1 | CPU Cycle #2 | 1. DMA Cycle #1 is terminated.
 | | 2. CPU Cycle #2 continues on.
 | | 3. DMAC releases the CPU Bus, disables itself (disabling further requests until the interrupt routine re-enables the DMAC), and generates an interrupt.
 | | 4. CPU cycles continue on.
DMA Cycle #1 | DMA Cycle #2 | 1. DMA Cycle #1 is terminated.
 | | 2. DMA Cycle #2 continues on.
 | | 3. DMAC releases the CPU Bus, disables itself (disabling further requests until the interrupt routine re-enables the DMAC), and generates an interrupt.
 | | 4. CPU cycles continue on.


8.5.6.3 Bus Error Timing with No Pending Operation 


If there are no pending operations on the bus, BUSERR* is ignored at all times. 


8.5.6.4 Bus Error Timing with One Pending Operation 


If there is one pending operation on the bus, BUSERR* is sampled while waiting for the
assertion of SYSAACK* or SYSDACK*. If BUSERR* is asserted, the bus cycle continues
as if SYSAACK* and/or the last SYSDACK* had been asserted. Figure 8-18,
Figure 8-19, and Figure 8-20 illustrate the bus error associated with one pending
operation. In these figures, BUSERR* is ignored before CPUDSTART* is asserted and after
BUSERR* has been asserted, because the bus is not waiting for the assertion of SYSAACK*
or SYSDACK*.
Figure 8-18. One Operation with BUSERR* as the Last SYSDACK*

Figure 8-19. One Operation with BUSERR* as SYSAACK*

Figure 8-20. One Operation with BUSERR* as SYSAACK* and the Last SYSDACK*


8.5.6.5 Bus Error Timing with Two Pending Operations 


If there are two pending operations on the bus, BUSERR* is sampled while waiting for the
assertion of SYSDACK*. If BUSERR* is asserted, the bus cycle will continue as if the last
SYSDACK* has been asserted. The bus cycle will then proceed with the data phase of the
next operation. The bus error that occurred is for the first pending operation.


Figure 8-21 illustrates the bus error associated with two pending operations. In this figure,
BUSERR* is ignored after BUSERR* has been asserted because the bus is no longer waiting
for the assertion of SYSDACK* corresponding to operation AddrA with the bus error, and
detection of a bus error for operation AddrB does not start until the assertion of CPUDSTART*.
Figure 8-21. Two Operations with Bus Error as the Last SYSDACK*

9. Performance Counter


The performance counter provides the means for gathering statistical information about
the internal events of the CPU and the pipeline during program execution. The statistics
gathered during program execution aid in tuning the performance of hardware and
software systems based on the processor.
9.1 Overview


The performance counter consists of one control register and two counters. The control 
register controls the functions of the monitor while the counters count the number of 
events specified by the control register. 


9.2 Performance Counters and Performance Control Registers 


The Performance Counter Control Register, or PCCR, and Performance Counter Registers
PCR0 and PCR1 are mapped into COP0 Register 25. The control register and both counters
are read/write registers accessible by the MTPC, MTPS, MTC0, MFPC, MFPS and MFC0
instructions. Each counter is capable of counting one event as specified by the control
register.


The format of the PCCR is shown in Figure 9-1, and the format of PCR0 and PCR1 is
shown in Figure 9-2.


Bits     31    30:20   19:15    14   13   12   11     10   9:5      4    3    2    1      0
Field    CTE   0       EVENT1   U1   S1   K1   EXL1   0    EVENT0   U0   S0   K0   EXL0   0
Width    1     11      5        1    1    1    1      1    5        1    1    1    1      1

Figure 9-1. Format of the Performance Counter Control Register PCCR

Bits     31     30:0
Field    OVFL   VALUE


Figure 9-2. Format of Performance Counter Registers PCR0 and PCR1
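
The two register formats can be exercised with a few lines of bit manipulation. This is a sketch under assumptions: the field positions are read from the bit rulers of Figure 9-1 and Figure 9-2, and the helper names are hypothetical rather than part of any Toshiba tool.

```python
# name: (least significant bit, width), as read from Figure 9-1
PCCR_FIELDS = {
    "CTE": (31, 1),
    "EVENT1": (15, 5), "U1": (14, 1), "S1": (13, 1), "K1": (12, 1), "EXL1": (11, 1),
    "EVENT0": (5, 5), "U0": (4, 1), "S0": (3, 1), "K0": (2, 1), "EXL0": (1, 1),
}


def pack_pccr(**fields):
    """Build a 32-bit PCCR value from named fields; unnamed fields stay 0."""
    word = 0
    for name, value in fields.items():
        lsb, width = PCCR_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word |= value << lsb
    return word


def unpack_pcr(word):
    """Split a PCR0/PCR1 value into its OVFL flag and 31-bit VALUE."""
    return {"OVFL": (word >> 31) & 1, "VALUE": word & 0x7FFFFFFF}
```

For example, `pack_pccr(CTE=1, EVENT0=1, K0=1)` yields `0x80000024`, a value that enables counting and selects event 1 for PCR0 in non-exception Kernel mode.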


The interpretation of the PCCR register bits is as follows: 


Table 9-1. PCCR Register Bits


Field | Function | Initial Value
CTE | If 1, PCR0 and PCR1 counting and exception generation are enabled. | 0
S0/1 | PCR0/1 counts event EVENT0/1 when in Supervisor mode. | Undefined
K0/1 | PCR0/1 counts event EVENT0/1 when in non-exception Kernel mode; i.e. with both STATUS.EXL and STATUS.ERL set to 0. | Undefined
EXL0/1 | PCR0/1 counts event EVENT0/1 when in Level 1 exception handler. | Undefined
9.2.1 Accessing Counters and Registers


The counter control register PCCR and the two performance counter registers PCR0 and
PCR1 are accessed by using MTC0* and MFC0* instructions. All three registers are
mapped to COP0 register 25. Table 9-2 illustrates how these registers are written by using
the MTC0 instruction, and Table 9-3 illustrates the encoding of the MFC0 instructions
used to read the registers.

Table 9-4 shows special mnemonics to access the performance counters and registers.


Table 9-2. Writing Performance Counters and Registers using MTC0


OpCode[15:11] | OpCode[1:0] | Operation
11001 | | Move to Counter Control Register
11001 | | Move to Performance Counter Register 0
11001 | |
11001 | | Move to Performance Counter Register 1


Table 9-3. Reading Performance Counters and Registers using MFC0


OpCode[15:11] | OpCode[1:0] | Operation
11001 | 00 | Move from Counter Control Register


Table 9-4. Mnemonics to Access the Performance Counters and Registers


MTPC    Move to Performance Counter
MTPS    Move to Performance Event Specifier
MFPC    Move from Performance Counter
MFPS    Move from Performance Event Specifier


* MTPC, MTPS, MFPC and MFPS are special encodings of MTC0 and MFC0.
9.2.2 State of Performance Counter Control Registers Upon Reset


The CTE bit of the Performance Counter Control Register PCCR is initialized to 0 upon
reset. This prevents event counting and interrupt generation until the control registers
are initialized. It also allows a precise way for counters to be initialized by software; see
Section 9.3.2 for more details. Note that the remaining bits of PCCR and both registers
PCR0 and PCR1 must be initialized by software.
9.3 Counter Operation


The performance counters PCR0 and PCR1 increment by 1 whenever their corresponding
count event occurs and the counter is enabled. The count event for PCR0 is specified by
PCCR.EVENT0 and the count event for PCR1 is specified by PCCR.EVENT1. The
encoding of the EVENT field is specified in Table 9-5, and discussed in detail later. A
counter is enabled only when both of the following conditions are satisfied:


1. The global counter enable flag PCCR.CTE is set to 1, and


2. The current privilege mode matches the permitted privilege mode for each
counter. The values in PCCR.U0, PCCR.S0, PCCR.K0, and PCCR.EXL0 specify the
permitted privilege modes for PCR0, and PCCR.U1, PCCR.S1, PCCR.K1, and
PCCR.EXL1 specify the permitted privilege modes for PCR1. For example, if the
current privilege mode is SUPERVISOR, PCR0 will operate only if PCCR.S0 is set
to 1. Note that there is no “ERL0” or “ERL1” flag in PCCR. This is because
counters are unconditionally disabled when in level 2 handlers.
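
The two enabling conditions can be combined into one predicate per counter. The sketch below is illustrative: the mode names and the function signature are hypothetical, and the flags `u`, `s`, `k` and `exl` stand for the U, S, K and EXL bits of the counter in question.

```python
def counter_enabled(cte, mode, u, s, k, exl):
    """Return True if a counter may count in the current privilege mode.

    mode is one of "USER", "SUPERVISOR", "KERNEL", "EXL" (level 1 handler)
    or "ERL" (level 2 handler); these labels are chosen for this sketch.
    """
    if not cte:
        return False  # condition 1: the global enable PCCR.CTE must be 1
    if mode == "ERL":
        return False  # counters are unconditionally disabled in level 2 handlers
    permitted = {"USER": u, "SUPERVISOR": s, "KERNEL": k, "EXL": exl}
    return bool(permitted[mode])  # condition 2: the matching mode bit must be set
```

For example, with only the S bit set, the counter runs in SUPERVISOR mode and is idle in every other mode.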
9.3.1 Counter Events


A counter increments if it is enabled and its trigger event occurs. The permissible values
for PCCR.EVENT0 and PCCR.EVENT1 are shown in Table 9-5 below. The events are
described in Section 9.3.1.1, Event Descriptions.


Table 9-5. Counter Events


Event | Counter 0 | Counter 1
0 | Reserved | Low-order branch issued
8 | Non-blocking load/store | WBB burst request unavailable
 | WBB single request | WBB burst request almost full
 | CPU address bus busy | CPU data bus busy

[The remaining rows and event numbers of this table are not legible in the source.]
9.3.1.1 Event Descriptions


In event descriptions, the word ‘branch’ (for example, ‘branch issued’, or ‘branch
mispredicted’) means any ‘transfer of control’ instruction that is subject to prediction (that is,
all the conditional branch instructions, J, and JAL). The JR, JALR, ERET, SYSCALL,
BREAK, and TRAP instructions are not included.


Branch issued
    This event is triggered whenever a branch is issued to a functional
    pipe. Note that a branch that is issued in a pipelined
    implementation may get canceled if an instruction prior to it
    signals an exception.

Branch mispredicted
    This event is triggered whenever the predicted branch address
    (taken or not-taken) is incorrect. Note that a branch that is issued
    in a pipelined implementation may get canceled if an instruction
    prior to it signals an exception.

BTAC miss
    This event is triggered whenever the instruction address lookup
    into the BTAC fails. It counts low-order (even) branch instructions
    that miss the BTAC. Note that a high-order (odd) branch does not
    refer to the BTAC.

COP1 instruction completed
    This event is triggered when a COP1 instruction completes. The
    event is signaled even if the COP1 instruction completes
    successfully, but appears in the branch delay slot of a
    branch-likely instruction and is therefore nullified.

CPU address bus busy
    Generates a signal once every BUSCLK (not CPU clock) in which the
    CPU address bus is unavailable. The CPU address bus is
    considered unavailable whenever it is busy, or when two addresses
    have been issued but the data for the first address has yet to
    return.

Data cache miss
    This event is triggered whenever a data cache miss is detected.
    See Table 9-6 for the D$ miss definition.


Table 9-6. Definition of Data Cache Miss 


[0 [a on. 
Load 


, [| Uncached, UCA 
Hit/Miss 


Uncached, UCA, Cached Hit 


Store | [Rinwached, UGA 
TIES 


Uncached, UCA, Cached Uncount * 


ret |, [Wneached, UGA 
HitMiss 


In this event, the data cache miss is defined as any load/store/pref 
instructions which may generate bus read operations to get missed data from 
external memory. 


* Prefetch to the Uncached or UCA page is considered as nop. 
DTLB accessed
    Barring canceled instructions, this event counts the total number
    of executed loads and stores. Thus, ‘data cache miss’ divided by
    ‘DTLB accessed’ provides a good estimate of the D$ miss rate
    (assuming no uncached loads/stores occur). Also, ‘DTLB miss’
    divided by ‘DTLB accessed’ provides the DTLB miss rate. The DTLB is
    accessed even when an unmapped page is accessed if the minor
    revision number is 0x10 or later.

DTLB miss
    This event is triggered whenever a DTLB miss is detected. The DTLB
    is accessed even when an unmapped page is accessed if the
    minor revision number is 0x10 or later.


Dual instruction issued
    This event is signaled whenever both functional pipes of the C790
    are issued instructions*. The event counter is incremented by 1.

Instruction cache miss
    This event is triggered whenever an instruction cache miss is
    detected.


Instruction completed
    This event triggers when an instruction completes. Note that some
    instructions (e.g. SYSCALL, TEQ, TEQI, etc.) signal exceptions as
    a normal part of their operation. Such instructions are considered
    complete whether or not the “normal” exception was raised.
    Therefore, an “instruction completed” event is signaled even if a
    TEQ succeeds (i.e. raises a Trap exception). However, if a “true”
    exception occurs (e.g. a counter exception is signaled while the
    TEQ is executing), the instruction is canceled and no “instruction
    completed” event is signaled. Similarly, an instruction in the
    branch delay slot (BDS) of a branch-likely instruction is counted
    as complete even if the BDS instruction is nullified. If the BDS
    instruction is canceled because of a “true” exception, no
    “instruction completed” event is signaled.

    C790 Implementation Note: Up to two instructions can complete
    every cycle in the C790. When two instructions do complete, the
    event counter is incremented by 2.


ITLB miss
    This event is triggered whenever an ITLB miss is detected.

JTLB miss
    This event is triggered whenever a JTLB miss is detected.


Load completed
    This event triggers when a load instruction completes. Note that
    the event is signaled even if the load appears in the branch delay
    slot of a branch-likely instruction that is not taken and is therefore
    nullified.

Low-order branch issued
    Counts the number of issued branches that appeared in
    the low-order (even) position of an instruction pair fetch. This
    count is needed since only these branches are subject to BTAC
    lookup.

No event
    This “event” effectively disables the corresponding counter. It is
    useful principally if only one of the two counters needs to be activated.

Non-BDS instruction completed
    This event triggers when an instruction that does not have a
    branch delay slot completes. In particular, it does not trigger when
    a branch or jump instruction completes. However, it does trigger
    when the instruction in the branch delay slot of the branch or
    jump completes. In the case of a branch-likely instruction, the
    instruction in the branch delay slot triggers the event even if this
    instruction is nullified. Note: this event is useful for stepping over
    instructions.


* (Dual instruction issued) * 2 + (Single instruction issued) = instruction issued
  (Instruction issued) - (instruction completed) = instruction canceled
Non-blocking load/store (1st cache miss)
    This event is signaled whenever a cached load/store/pref
    instruction misses on the Data Cache and there is no pending
    data cache miss, UCAB miss and uncached load.


Processor cycle
    This event triggers on every processor clock cycle.

Single instruction issued
    This event is signaled whenever only one of the functional pipes
    of the C790 is issued an instruction*.

Store completed
    This event triggers when a store instruction completes. Note that
    the event is signaled even if the store appears in the branch delay
    slot of a branch-likely instruction that is not taken and is
    therefore nullified.

WBB Single Request
    A non-burst request was made to the WBB.

WBB Burst Request
    A burst request was made to the WBB.

WBB Single Request unavailable
    A non-burst request was made to the WBB, but there were
    insufficient free entries in the WBB to service it. All 8 entries are
    used at that time.

WBB Burst Request unavailable
    A burst request was made to the WBB, but the WBB was
    completely full, or there were not enough free entries to service the
    request. 5, 6, 7, or 8 entries are used at that time.

WBB Burst Request almost full
    A burst request was made to the WBB, and even though there
    were free entries, there were not enough to service the request. 5,
    6, or 7 entries are used at that time.

WBB Burst Request full
    A burst request was made to the WBB, but the WBB was
    completely full. All 8 entries are used at that time.


9.3.2 Handling Performance Counter Exceptions


A performance counter exception is detected by an instruction if the following condition 
holds true: 
    ~STATUS.ERL && PCCR.CTE && (CTR0.OVFL || CTR1.OVFL)
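The condition above can be sketched in C as a predicate over raw register values. This is only an illustration: the bit positions used for STATUS.ERL, PCCR.CTE and the OVFL flags are assumptions not stated in this section, and the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (not given in this section). */
#define STATUS_ERL (UINT32_C(1) << 2)   /* assumed: STATUS.ERL */
#define PCCR_CTE   (UINT32_C(1) << 31)  /* assumed: PCCR.CTE */
#define PCR_OVFL   (UINT32_C(1) << 31)  /* assumed: OVFL flag of each counter */

/* True when ~STATUS.ERL && PCCR.CTE && (CTR0.OVFL || CTR1.OVFL) holds. */
bool counter_exception_pending(uint32_t status, uint32_t pccr,
                               uint32_t pcr0, uint32_t pcr1)
{
    bool erl  = (status & STATUS_ERL) != 0;
    bool cte  = (pccr & PCCR_CTE) != 0;
    bool ovfl = ((pcr0 | pcr1) & PCR_OVFL) != 0;
    return !erl && cte && ovfl;
}

/* Usage: an overflow in counter 1 is reported only while ERL is clear. */
void counter_exception_example(void)
{
    assert(counter_exception_pending(0, PCCR_CTE, 0, PCR_OVFL));
    assert(!counter_exception_pending(STATUS_ERL, PCCR_CTE, 0, PCR_OVFL));
}
```

Because ERL masks the condition, an overflow that occurs while a level 2 exception is being handled stays pending until ERL is cleared.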


Note that software should not rely on the exception occurring if the instruction is nullified; 
i.e. it appears in the branch delay slot of a branch likely instruction that is not taken. 


C790 Implementation Note: C790 implementation always counts events that occur within 
nullified instructions. 


The instruction detecting a counter exception is canceled by the exception, and instruction 
execution continues as follows: 


    if ( in branch delay slot ) {
        ErrorEPC = PC - 4;
        CAUSE.BD2 = 1;
    }
    else {
        ErrorEPC = PC;
        CAUSE.BD2 = 0;
    }
    if ( STATUS.DEV )
        PC = 0xBFC00280;    // Uncached counter xcp handler
    else
        PC = 0x80000080;    // “Normal” counter xcp handler
    STATUS.ERL = 1;
    CAUSE.EXC2 = 2;         // Counter exception
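The entry sequence above can also be written as a small C model, which may help when building a simulator or checking a handler. The structure and function names are hypothetical; only the values (the two vector addresses, the PC - 4 adjustment, ERL = 1 and EXC2 = 2) come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state changed on counter exception entry. */
struct counter_xcp_state {
    uint32_t pc;         /* address of the handler that runs next */
    uint32_t error_epc;  /* ErrorEPC */
    bool     bd2;        /* CAUSE.BD2 */
    unsigned exc2;       /* CAUSE.EXC2 */
    bool     erl;        /* STATUS.ERL */
};

/* pc:            address of the instruction that detected the exception
 * in_delay_slot: that instruction sits in a branch delay slot
 * status_dev:    current value of STATUS.DEV */
struct counter_xcp_state counter_xcp_enter(uint32_t pc, bool in_delay_slot,
                                           bool status_dev)
{
    struct counter_xcp_state s;
    assert((pc & 3u) == 0);  /* instructions are word aligned */
    s.error_epc = in_delay_slot ? pc - 4u : pc;
    s.bd2  = in_delay_slot;
    s.pc   = status_dev ? UINT32_C(0xBFC00280) : UINT32_C(0x80000080);
    s.erl  = true;
    s.exc2 = 2;              /* counter exception */
    return s;
}
```

When the instruction is in a delay slot, ErrorEPC points at the preceding branch, so returning from the handler re-executes the branch together with its delay slot.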


The description above makes use of the BD2 and EXC2 fields in the CAUSE register. Both 
are fields newly introduced in the C790 and occupy the bit positions shown below. 


[Register diagram: 32-bit CAUSE register, bits 31 to 0, with the BD, BD2, CE, EXC2, IP and ExcCode fields marked.]


Figure 9-3. CAUSE Register Fields 


C790 Programming Note: The “normal” exception entry point is in kseg0 space. That is,
the address is unmapped and the caching policy is determined by CONFIG.K0. If you do
not want to disturb the cache while counting and stepping, configure kseg0 in “uncached”
mode. If preserving cache contents matters less than the speed of servicing counter
exceptions, configure kseg0 in “cached” mode.

9.3.3 Priority of Counter Exceptions


Counter exceptions have the highest priority after cold reset and NMI. If a cold reset
occurs, the processor is initialized, so a simultaneous counter exception is discarded. If an
NMI occurs, the NMI handler is entered with either PCR0.OVFL or PCR1.OVFL (or both)
set to 1, and ErrorEPC pointing at the instruction causing the counter overflow.
(ErrorEPC is used because NMI is handled as a level 2 exception.) Once the NMI handler
exits, the instruction that caused the overflow is re-executed. However, since PCR0.OVFL
or PCR1.OVFL is 1, the instruction is canceled once more and the counter exception
handler is entered.


9.3.4 Initializing Counters 


Let us look at the code sequence needed to initialize counters and activate them. In the
example below, PCR0 is set up to count clocks in all operating modes and report a counter
exception after the count exceeds 2^31. PCR1 is set up to count stores while in supervisor
mode only, and report a counter exception after the count exceeds 2^31. The code must be
executed while in level 2 exception mode (ERL=1).


    STATUS.ERL = 1;       // Set ERL (to inhibit counting)
    ErrorEPC = <target instruction where counting is to start>
    PCR0 = 0;             // Init PCR0, and ...
    PCCR.EVENT0 = 1;      // ... set up to count clocks ...
    PCCR.U0 = 1;          // ... in all privilege modes
    PCCR.S0 = 1;
    PCCR.K0 = 1;
    PCCR.EXL0 = 1;
    PCR1 = 0;             // Init PCR1, and ...
    PCCR.EVENT1 = 15;     // ... set up to count completed stores ...
    PCCR.U1 = 0;          // ... while in supervisor mode
    PCCR.S1 = 1;
    PCCR.K1 = 0;
    PCCR.EXL1 = 0;

    PCCR.CTE = 1;         // Enable global counter flag

    ERET                  // Execute ERET to clear ERL -
                          // counting begins with ERET’s target.
                          // Note that the ERET instruction also
                          // guarantees that the COP0 state
                          // updated (e.g. CCR) is valid.
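In practice the individual field assignments above are usually combined into a single PCCR write. The following C sketch packs the fields into one 32-bit value. The bit layout it uses (counter 0 fields starting at bit 1, counter 1 fields starting at bit 11, 5-bit EVENT fields, CTE in bit 31) is an assumption made for illustration and should be checked against the PCCR register description; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Per-counter control fields, named after the PCCR fields used above. */
struct pccr_counter_cfg {
    unsigned event;          /* EVENTn (assumed to be 5 bits wide) */
    unsigned u, s, k, exl;   /* Un, Sn, Kn, EXLn enable bits */
};

/* Pack one counter's fields; 'shift' is the assumed position of its EXL bit. */
uint32_t pccr_pack_counter(struct pccr_counter_cfg c, unsigned shift)
{
    assert(c.event < 32 && c.u < 2 && c.s < 2 && c.k < 2 && c.exl < 2);
    uint32_t v = ((uint32_t)c.event << 4) | (c.u << 3) | (c.s << 2)
               | (c.k << 1) | c.exl;
    return v << shift;
}

/* Assumed layout: counter 0 at bits 9:1, counter 1 at bits 19:11, CTE at 31. */
uint32_t pccr_pack(struct pccr_counter_cfg c0, struct pccr_counter_cfg c1,
                   unsigned cte)
{
    return pccr_pack_counter(c0, 1) | pccr_pack_counter(c1, 11)
         | ((uint32_t)(cte & 1u) << 31);
}
```

With the settings from the example (EVENT0 = 1 in all modes, EVENT1 = 15 in supervisor mode only, CTE = 1), this assumed layout yields 0x8007A03E.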

9.3.5 The Note to Read Counters


Whenever you read a counter with MFC0 or MFPC, make sure that no event being counted
can occur during the read; otherwise the value read may be wrong. For example, a counter
for a TLB event should be read from an unmapped area, and a counter for the instruction
completion event should be read with ERL=1 (level 2 exception) or in another state in
which counting is disabled.


The point at which an event is counted is implementation dependent. It depends on the
number of pipeline stages, among other things.


To write code that is robust across silicon versions and mask versions, read the counters
after flushing the pipeline with a SYNC.P instruction. Because the C790 is a pipelined
processor, this is required for instruction completion type events.


Some inaccuracy is inherent in event counting. Do not be surprised if different counts
are observed on different versions of the silicon or mask.

10. Floating-Point Unit, CP1 (Option)


This chapter describes the floating-point operations, including the programming model,
instruction set and formats.


The floating-point operations fully conform to the requirements of ANSI/IEEE Standard
754-1985, IEEE Standard for Binary Floating-Point Arithmetic.

10.1 Overview


All floating-point instructions, as defined in the MIPS ISA for the floating-point
coprocessor, CP1, are processed by a hardware unit separate from the one that executes
integer instructions.


The floating-point execution unit can be disabled by the coprocessor usability (CU) bit
defined in the CP0 Status register.


10.2 Floating Point Register 


10.2.1 Floating-Point General Registers (FGRs) 


CP1 has a set of Floating-Point General Purpose registers (FGRs) that can be accessed in 
the following ways: 


• As 32 general purpose registers (32 FGRs), each of which is 32 bits wide when the FR
  bit in the CPU Status register equals 0; or as 32 general purpose registers (32 FGRs),
  each of which is 64 bits wide when FR equals 1. The CPU accesses these registers
  through move, load, and store instructions.


• As 16 floating-point registers (see the next section for a description of FPRs), each of
  which is 64 bits wide, when the FR bit in the CPU Status register equals 0. The FPRs
  hold values in either single- or double-precision floating-point format. Each FPR
  corresponds to adjacently numbered FGRs as shown in Figure 10-1.


• As 32 floating-point registers (see the next section for a description of FPRs), each of
  which is 64 bits wide, when the FR bit in the CPU Status register equals 1. The FPRs
  hold values in either single- or double-precision floating-point format. Each FPR
  corresponds to an FGR as shown in Figure 10-1.

[Diagram: the Floating-Point General Purpose Registers (FGRs) in the center, mapped to
Floating-Point Registers FPR0, FPR2, ..., FPR28, FPR30 when FR = 0 and to Floating-Point
Registers (FPRs) when FR = 1. Below them, the Floating-Point Control Registers (FCRs): the
Control/Status Register FCR31 (bits 31 to 0) and the Implementation/Revision Register
FCR0 (bits 31 to 0).]


Figure 10-1. FP Registers 

10.2.2 Floating-Point Registers (FPRs)


The FPU provides:
• 16 Floating-Point registers (FPRs) when the FR bit in the Status register equals 0, or
• 32 Floating-Point registers (FPRs) when the FR bit in the Status register equals 1.


These 64-bit registers hold floating-point values during floating-point operations and are
physically formed from the General Purpose registers (FGRs). When the FR bit in the
Status register equals 1, the FPR references a single 64-bit FGR.


The FPRs hold values in either single- or double-precision floating-point format. If the FR
bit equals 0, only even numbers (the least register) can be used to address FPRs. When
the FR bit is set to 1, all FPR register numbers are valid.


If the FR bit equals 0 during a double-precision floating-point operation, the general
registers are accessed in pairs. Thus, in a double-precision operation, selecting
Floating-Point Register 0 (FPR0) actually addresses adjacent Floating-Point General
Purpose registers FGR0 and FGR1.
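The FPR-to-FGR relationship described in this section can be summarized in a short C sketch. It is an illustrative model only, with hypothetical names, and it applies the rule exactly as worded above: with FR = 0 only even FPR numbers are accepted, and a double-precision operand spans two adjacent FGRs.

```c
#include <assert.h>
#include <stdbool.h>

/* FGR numbers backing one FPR: 'count' is 1 or 2. */
struct fgr_span {
    int first;
    int count;
};

/* Returns true and fills *out if 'fpr' is a valid FPR number for the given
 * FR setting; returns false otherwise. */
bool fpr_to_fgr(int fpr, bool fr, bool is_double, struct fgr_span *out)
{
    assert(out != 0);
    if (fpr < 0 || fpr > 31)
        return false;
    if (!fr && (fpr & 1))
        return false;             /* FR = 0: only even FPR numbers are valid */
    out->first = fpr;
    /* FR = 1: one 64-bit FGR. FR = 0: a double spans FGRn and FGRn+1. */
    out->count = (!fr && is_double) ? 2 : 1;
    return true;
}
```

For example, with FR = 0 a double-precision access to FPR0 maps to FGR0 and FGR1, while with FR = 1 an access to FPR1 maps to the single 64-bit FGR1.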


10.2.3 Floating-Point Control Registers 


The MIPS RISC architecture defines 32 floating-point control registers (FCRs); the C790
processor implements two of these registers: FCR0 and FCR31. These FCRs are described
below:


• The Implementation/Revision register (FCR0) holds revision information.


• The Control/Status register (FCR31) controls and monitors exceptions, holds the
  result of compare operations, and establishes rounding modes.


• FCR1 to FCR30 are reserved.


Table 10-1 lists the assignments of the FCRs.


Table 10-1. Floating-Point Control Register Assignments


FCR0             Coprocessor implementation and revision register
FCR1 to FCR30    Reserved
FCR31            Rounding mode, cause, trap enables, and flags

Implementation and Revision Register (FCR0)


The read-only Implementation and Revision register (FCR0) specifies the implementation
and revision number of CP1. This information can determine the coprocessor revision and
performance level, and can also be used by diagnostic software.


Figure 10-2 shows the layout of the register; Table 10-2 describes the Implementation and
Revision register (FCR0) fields.


Implementation/Revision Register (FCR0)

Bits     31:16    15:8    7:0
Field    0        Imp     Rev
Width    16       8       8

Figure 10-2. Implementation/Revision Register


Table 10-2. FCR0 Fields


Field    Description
Imp      Implementation number
Rev      Revision number in the form of y.x
0        Reserved. Returns zeroes when read.


The revision number is a value of the form y.x, where:
• y is a major revision number held in bits 7:4.
• x is a minor revision number held in bits 3:0.


The revision number distinguishes some chip revisions; however, there is no guarantee
that changes to the chip are necessarily reflected by the revision number, or that changes
to the revision number necessarily reflect real chip changes. For this reason revision
number values are not listed, and software should not rely on the revision number to
characterize the chip.
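A value read from FCR0 (for example with CFC1) can be split into its fields as in the following C sketch. The field positions follow the description above (Imp in bits 15:8, major revision in bits 7:4, minor revision in bits 3:0); the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct fcr0_fields {
    unsigned imp;        /* bits 15:8, implementation number */
    unsigned rev_major;  /* bits 7:4, y of revision y.x */
    unsigned rev_minor;  /* bits 3:0, x of revision y.x */
};

struct fcr0_fields fcr0_decode(uint32_t fcr0)
{
    struct fcr0_fields f;
    assert((fcr0 >> 16) == 0);  /* reserved bits 31:16 read as zero */
    f.imp       = (fcr0 >> 8) & 0xFFu;
    f.rev_major = (fcr0 >> 4) & 0xFu;
    f.rev_minor = fcr0 & 0xFu;
    return f;
}
```

As noted above, the decoded revision is suitable for display or diagnostics, not for deciding how the chip behaves.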


IEEE Standard 754 


IEEE Standard 754 specifies that floating-point operations detect certain exceptional
cases, raise flags, and can invoke an exception handler when an exception occurs. These
features are implemented in the MIPS architecture with the Cause, Enable, and Flag
fields of the Control/Status register. The Flag bits implement IEEE 754 exception status
flags, and the Cause and Enable bits implement exception handling.

Control/Status Register (FCR31)


The Control/Status register (FCR31) contains control and status information that can be
accessed by instructions in either Kernel or User mode. FCR31 also controls the
arithmetic rounding mode and enables User mode traps, and it identifies any exceptions
that occurred in the most recently executed floating-point instruction, along with any
exceptions that occurred without being trapped.


Figure 10-3 shows the format of the Control/Status register, and Table 10-3 describes the
Control/Status register fields. Figure 10-4 shows the Control/Status register Cause, Flag,
and Enable fields.


Control/Status Register (FCR31)

Bits     31:25   24   23   22:18   17:12         11:7        6:2         1:0
Field    0       FS   C    0       Cause         Enables     Flags       RM
                                   E V Z O U I   V Z O U I   V Z O U I
Width    7       1    1    5       6             5           5           2


Figure 10-3. FP Control/Status Register Bit Assignments 


Table 10-3. Control/Status Register Fields 


FS         When set, denormalized results can be flushed instead of causing
           an unimplemented operation exception.

C          Condition bit. See description of Control/Status register Condition
           bit.

Cause      Cause bits. See Figure 10-4 and the description of Control/Status
           register Cause, Flag, and Enable bits.

Enables    Enable bits. See Figure 10-4 and the description of Control/Status
           register Cause, Flag, and Enable bits.

Flags      Flag bits. See Figure 10-4 and the description of Control/Status
           register Cause, Flag, and Enable bits.

RM         Rounding mode bits. See Table 10-5 and the description of
           Control/Status register Rounding Mode Control bits.




             E    V    Z    O    U    I
Bit #        17   16   15   14   13   12     Cause bits
Bit #             11   10   9    8    7      Enable bits
Bit #             6    5    4    3    2      Flag bits

I: Inexact Operation
U: Underflow
O: Overflow
Z: Division by Zero
V: Invalid Operation
E: Unimplemented Operation


Figure 10-4. Control/Status Register Cause, Flag, and Enable Fields 


Control/Status Register FS Bit 


The FS bit enables the flushing of denormalized values. When the FS bit is set and the
Underflow and Inexact Enable bits are not set, denormalized results are flushed instead of
causing an Unimplemented Operation exception. Results are flushed to either 0 or the
minimum normalized value, depending upon the rounding mode (see Table 10-4 below),
and the Underflow and Inexact bits of the Cause and Flag fields are set.


Table 10-4. Flush Values of Denormalized Results


Denormalized    Flushed Result, by Rounding Mode
Result          RN      RZ      RP          RM
Positive        +0      +0      +2^Emin     +0
Negative        -0      -0      -0          -2^Emin

Control/Status Register Condition Bit 


When a floating-point Compare operation takes place, the result is stored at bit 23, the 
Condition bit. The C bit is set to 1 if the condition is true; the bit is cleared to 0 if the 
condition is false. Bit 23 is affected only by compare and CTC1 instructions. 

Control/Status Register Cause, Flag, and Enable Fields


Figure 10-4 illustrates the Cause, Flag, and Enable fields of the Control/Status register.
The Cause and Flag fields are updated by all conversion, computational (except MOV.fmt),
CTC1, reserved, and unimplemented instructions. All other instructions have no effect on
these fields.


Cause Bits 


Bits 17:12 in the Control/Status register contain Cause bits, as shown in Figure
10-4, which reflect the results of the most recently executed floating-point
instruction. The Cause bits are a logical extension of the CP0 Cause register; they
identify the exceptions raised by the last floating-point operation. If the
corresponding Enable bit is set at the time of the exception, a floating-point
exception is raised and trapped by the CPU. If more than one exception occurs on a
single instruction, each appropriate bit is set.


The Cause bits are updated by most floating-point operations. The Unimplemented
Operation (E) bit is set to 1 if software emulation is required, otherwise it remains 0.
The other bits are set to 1 or 0 to indicate the occurrence or non-occurrence
(respectively) of an IEEE 754 exception. Within the set of floating-point
instructions that update the Cause bits, the Cause field indicates the exceptions
raised by the most recently executed instruction.


When a floating-point exception is taken, no results are stored, and the only state
affected is the Cause field.


Enable Bits 


A floating-point exception is generated any time a Cause bit and the corresponding 
Enable bit are set. A floating-point operation that sets an enabled Cause bit forces 
an immediate floating-point exception, as does setting both Cause and Enable bits 
with CTC1. 


There is no enable for Unimplemented Operation (E). An Unimplemented exception 
always generates a floating-point exception. 


Before returning from a floating-point exception, software must first clear the 
enabled Cause bits with a CTC1 instruction to prevent a repeat of the exception 
trapping. Thus, User mode programs can never observe enabled Cause bits set; if 
this information is required in a User mode handler, it must be passed somewhere 
other than the Status register. 


For a floating-point operation that sets only unenabled Cause bits, no floating-point 
exception occurs and the default result defined by IEEE 754 is stored. In this case, 
the exceptions that were caused by the immediately previous floating-point 
operation can be determined by reading the Cause field. 
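The interaction between the Cause, Enable, and Flag fields can be sketched in C as follows. The field positions (Cause in bits 17:12 with E in bit 17, Enables in bits 11:7, Flags in bits 6:2) are taken from the register figures; the function names are hypothetical, and the flag update models only the untrapped case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FCR31_CAUSE_SHIFT  12                    /* I cause bit */
#define FCR31_ENABLE_SHIFT 7                     /* I enable bit */
#define FCR31_FLAG_SHIFT   2                     /* I flag bit */
#define FCR31_CAUSE_E      (UINT32_C(1) << 17)   /* Unimplemented Operation */

/* True if the Cause and Enable bits in 'fcr31' call for an FP exception:
 * E always traps, the other five trap only when enabled. */
bool fcr31_trap_pending(uint32_t fcr31)
{
    uint32_t cause  = (fcr31 >> FCR31_CAUSE_SHIFT) & 0x1Fu;   /* V Z O U I */
    uint32_t enable = (fcr31 >> FCR31_ENABLE_SHIFT) & 0x1Fu;
    return (fcr31 & FCR31_CAUSE_E) != 0 || (cause & enable) != 0;
}

/* Untrapped case: accumulate the five IEEE Cause bits into the Flag bits. */
uint32_t fcr31_accumulate_flags(uint32_t fcr31)
{
    assert(!fcr31_trap_pending(fcr31));
    uint32_t cause = (fcr31 >> FCR31_CAUSE_SHIFT) & 0x1Fu;
    return fcr31 | (cause << FCR31_FLAG_SHIFT);
}
```

The same predicate explains why writing both a Cause bit and its Enable bit with CTC1 raises an exception immediately, and why a handler must clear the enabled Cause bits before returning.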

Flag Bits


The Flag bits are cumulative and indicate the exceptions that were raised by the
operations that were executed since the bits were explicitly reset. Flag bits are set
to 1 if an IEEE 754 exception is raised, otherwise they remain unchanged. The Flag
bits are never cleared as a side effect of floating-point operations; however, they can
be set or cleared by writing a new value into the Status register, using a CTC1
instruction.


When a floating-point exception is trapped, the flag bits are not set by the 
hardware; floating-point exception software is responsible for setting these bits 
before invoking a user handler. 


Control/Status Register Rounding Mode Control Bits


Bits 1 and 0 in the Control/Status register constitute the Rounding Mode (RM) field.


As shown in Table 10-5, these bits specify the rounding mode that CP1 uses for all
floating-point operations.


Table 10-5. Rounding Mode Bit Decoding


Rounding Mode    Mnemonic    Description
RM (1:0)

0                RN          Round result to nearest representable value;
                             round to value with least-significant bit 0
                             when the two nearest representable values
                             are equally near.

1                RZ          Round toward 0: round to value closest to
                             and not greater in magnitude than the
                             infinitely precise result.

2                RP          Round toward +∞: round to value closest to
                             and not less than the infinitely precise result.

3                RM          Round toward -∞: round to value closest to
                             and not greater than the infinitely precise
                             result.
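Since the RM field occupies bits 1:0 of the Control/Status register, changing the rounding mode is a read-modify-write of FCR31 (read with CFC1, written back with CTC1). The C sketch below shows only the bit manipulation on a register image; the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding mode encodings of the RM field (bits 1:0 of FCR31). */
enum fcr31_rm { FCR31_RN = 0, FCR31_RZ = 1, FCR31_RP = 2, FCR31_RM = 3 };

/* Return 'fcr31' with the RM field replaced and all other bits unchanged. */
uint32_t fcr31_set_rm(uint32_t fcr31, enum fcr31_rm rm)
{
    assert((unsigned)rm <= 3u);
    return (fcr31 & ~UINT32_C(3)) | (uint32_t)rm;
}

/* Extract the current rounding mode from an FCR31 image. */
enum fcr31_rm fcr31_get_rm(uint32_t fcr31)
{
    return (enum fcr31_rm)(fcr31 & 3u);
}
```

Preserving the other bits matters because the same register holds the Enable and Flag fields.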


10.2.4 Accessing the FP Control and Implementation/Revision 
Registers 


The Control/Status and the Implementation/Revision registers are read by a Move Control
From Coprocessor 1 (CFC1) instruction.


The bits in the Control/Status register can be set or cleared by writing to the register
using a Move Control To Coprocessor 1 (CTC1) instruction. The Implementation/Revision
register is a read-only register. There are no pipeline hazards (between any instructions)
associated with floating-point control registers.

10.3 Floating-Point Formats


CP1 performs both 32-bit (single-precision) and 64-bit (double-precision) IEEE standard
floating-point operations. The 32-bit single-precision format has a 24-bit
signed-magnitude fraction field (f+s) and an 8-bit exponent (e), as shown in Figure 10-5.


Bits     31      30:23       22:0
Field    Sign    Exponent    Fraction
Width    1       8           23


Figure 10-5. Single-Precision Floating-Point Format 


The 64-bit double-precision format has a 53-bit signed-magnitude fraction field (f+s) and 
an 11-bit exponent, as shown in Figure 10-6. 


Bits     63      62:52       51:0
Field    Sign    Exponent    Fraction
Width    1       11          52


Figure 10-6. Double-Precision Floating-Point Format 

As shown in the above figures, numbers in floating-point format are composed of three
fields:
• sign field, s
• biased exponent, e = E + bias
• fraction, f = .b1b2...bp-1

where bias = 127, p = 24 in single precision,
      bias = 1023, p = 53 in double precision


The range of the unbiased exponent E includes every integer between the two values Emin
and Emax inclusive, together with two other reserved values:


• Emin - 1 (to encode 0 and denormalized numbers)
• Emax + 1 (to encode ∞ and NaNs [Not a Number])


For single- and double-precision formats, each representable nonzero numerical value has
exactly one encoding.


For single- and double-precision formats, the value of a number, v, is determined by the
equations shown in Table 10-6.

Table 10-6. Equations for Calculating Values in Single and Double-Precision Floating-Point Format


Condition                    Value
E = Emax+1 and f ≠ 0         v is NaN, regardless of s
E = Emax+1 and f = 0         v = (-1)^s ∞
Emin ≤ E ≤ Emax              v = (-1)^s 2^E (1.f)
E = Emin-1 and f ≠ 0         v = (-1)^s 2^Emin (0.f)
E = Emin-1 and f = 0         v = (-1)^s 0


For all floating-point formats, if v is NaN, the most-significant bit of f determines whether
the value is a signaling or quiet NaN: v is a signaling NaN if the most-significant bit of f is
set, otherwise, v is a quiet NaN.
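The value equations can be exercised with a small C decoder for the single-precision format. This is an illustrative sketch with hypothetical function names: it uses bias = 127 and the 8-bit exponent and 23-bit fraction fields of Figure 10-5, takes Emin = -126 from the IEEE 754 single format, and applies the signaling NaN rule exactly as worded above (most-significant fraction bit set).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Exact power of two for the exponent range used here. */
double two_to_the(int n)
{
    double r = 1.0;
    for (; n > 0; n--) r *= 2.0;
    for (; n < 0; n++) r *= 0.5;
    return r;
}

/* Value v of a single-precision bit pattern. */
double decode_single(uint32_t bits)
{
    int      s = (int)(bits >> 31);
    int      e = (int)((bits >> 23) & 0xFFu);   /* biased exponent */
    uint32_t f = bits & 0x7FFFFFu;              /* 23 fraction bits */
    double sign = s ? -1.0 : 1.0;
    double frac = (double)f / 8388608.0;        /* 0.f, divisor is 2^23 */

    assert(e >= 0 && e <= 255);
    if (e == 255)                               /* E = Emax + 1 */
        return f ? (double)NAN : sign * (double)INFINITY;
    if (e == 0)                                 /* E = Emin - 1 */
        return sign * frac * two_to_the(-126);  /* 0.f * 2^Emin, or zero */
    return sign * (1.0 + frac) * two_to_the(e - 127);   /* 1.f * 2^E */
}

/* 1 if 'bits' is a NaN whose most-significant fraction bit is set. */
int is_signaling_nan_single(uint32_t bits)
{
    int is_nan = ((bits >> 23) & 0xFFu) == 0xFFu && (bits & 0x7FFFFFu) != 0;
    return is_nan && (bits & 0x400000u) != 0;
}
```

For instance, the pattern 0x3F800000 has s = 0, e = 127 and f = 0, so E = 0 and the value is 1.0.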


Table 10-7 defines the values for the format parameters; minimum and maximum 
floating-point values are given in Table 10-8. 


Table 10-7. Floating-Point Format Parameter Values


Parameter                    Format
                             Single     Double
Emax                         +127       +1023
Emin                         -126       -1022
Exponent bias                +127       +1023
Exponent width in bits       8          11
Integer bit                  hidden     hidden
Fraction width in bits †     23         52
Format width in bits         32         64

† Excluding the sign bit.


Table 10-8. Minimum and Maximum Floating-Point Values


Type                     Value
Float Minimum            1.40129846e-45
Float Minimum Norm       1.17549435e-38
Float Maximum            3.40282347e+38
Double Minimum           4.9406564584124654e-324
Double Minimum Norm      2.2250738585072014e-308
Double Maximum           1.7976931348623157e+308
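The minimum and maximum values follow directly from the format parameters. As a sketch, the C code below derives them from p, Emin and Emax, using the IEEE 754 values Emin = -126, Emax = +127 for single precision and Emin = -1022, Emax = +1023 for double precision; the function and type names are hypothetical.

```c
#include <assert.h>

struct fp_limits {
    double min_denorm;  /* smallest positive denormalized value */
    double min_norm;    /* smallest positive normalized value */
    double max;         /* largest finite value */
};

/* Exact power of two, built by repeated doubling or halving. */
double two_to_the(int n)
{
    double r = 1.0;
    for (; n > 0; n--) r *= 2.0;
    for (; n < 0; n++) r *= 0.5;
    return r;
}

/* p: precision in bits including the hidden integer bit. */
struct fp_limits fp_format_limits(int p, int emin, int emax)
{
    struct fp_limits l;
    assert(p >= 2 && emin < 0 && emax > 0);
    l.min_denorm = two_to_the(emin - (p - 1));   /* 2^Emin * 2^-(p-1) */
    l.min_norm   = two_to_the(emin);             /* 2^Emin * 1.0 */
    l.max        = (2.0 - two_to_the(-(p - 1))) * two_to_the(emax);
    return l;
}
```

Calling fp_format_limits(24, -126, 127) gives the three single-precision values, and fp_format_limits(53, -1022, 1023) the three double-precision values, provided the host evaluates double arithmetic in the IEEE 754 double format without flushing denormalized values.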

10.4 Binary Fixed-Point Format


Binary fixed-point values are held in 2’s complement format. Unsigned fixed-point values 
are not directly provided by the floating-point instruction set. Figure 10-7 illustrates 
binary word fixed-point format and Figure 10-8 illustrates binary long fixed-point format; 
Table 10-9 lists the binary fixed-point format fields. 


Bits     31      30:0
Field    Sign    Integer
Width    1       31

Figure 10-7. Binary Word Fixed-Point Format


Bits     63      62:0
Field    Sign    Integer
Width    1       63

Figure 10-8. Binary Long Fixed-Point Format


Field assignments of the binary fixed-point format are:


Table 10-9. Binary Fixed-Point Format Fields


Field      Assignment
sign       sign bit
integer    integer value (2’s complement)

10.5 Floating-Point Instruction Set Summary


Each instruction is 32 bits long, and aligned on a word boundary. This section gives an
overview of the instructions for the floating-point unit. A detailed description of each
instruction is provided in Appendix D.


10.5.1 Load, Store and Move Instructions (Table 10-10)


Load and Store instructions move data between memory and FPU general purpose
registers (FGR), and Move instructions move data directly between CPU and FPU general
purpose registers (FGR). These instructions do not perform format conversions and
therefore never cause floating-point exceptions. The instruction immediately following a
load can use the contents of the loaded register. In that case, however, the hardware
interlocks, which costs additional cycles. Load delay slots should therefore be
scheduled to avoid the interlock.


Table 10-10. FPU Instruction Set (Optional): Load, Move and Store Instruction


Move Word from FPU (coprocessor 1)
Store Doubleword from FPU (coprocessor 1)

10.5.2 Conversion Instructions (Table 10-11)


Conversion instructions perform conversion operations between the various data formats.


Table 10-11. FPU Instruction Set (Optional): Conversion Instruction


Instruction    Description                                          Note
CVT.S.fmt      Floating-Point Convert to Single FP Format           MIPS I
CVT.W.fmt      Floating-Point Convert to Word Fixed-Point Format    MIPS I
CEIL.W.fmt     Floating-point Ceiling Convert to Word Fixed-Point   MIPS II
ROUND.L.fmt    Floating-point Round to Long Fixed-Point             MIPS III
TRUNC.L.fmt    Floating-point Truncate to Long Fixed-Point          MIPS III
CEIL.L.fmt     Floating-point Ceiling Convert to Long Fixed-Point   MIPS III
FLOOR.L.fmt    Floating-point Floor Convert to Long Fixed-Point     MIPS III


10.5.3 Computational Instructions (Table 10-12) 


Computational instructions perform arithmetic operations on floating-point values in the
FPU registers. There are two categories of computational instructions:


• 3-Operand Register-Type instructions, which perform floating-point addition,
  subtraction, multiplication, and division operations


• 2-Operand Register-Type instructions, which perform floating-point absolute value,
  move, negate, and square root operations.


Table 10-12. FPU Instruction Set (Optional): Computational Instruction


ADD.fmt     Floating-point Add               MIPS I
SUB.fmt     Floating-point Subtract          MIPS I
MUL.fmt     Floating-point Multiply          MIPS I
DIV.fmt     Floating-point Divide            MIPS I
ABS.fmt     Floating-point Absolute Value    MIPS I
MOV.fmt     Floating-point Move              MIPS I
NEG.fmt     Floating-point Negate            MIPS I
SQRT.fmt    Floating-point Square root       MIPS II

10.5.4 Compare and Branch Instructions (Table 10-13)


Compare instructions perform comparisons of the contents of registers and set a
condition bit based on the results. Branch on FPU Condition instructions perform a
branch to the specified target if the specified coprocessor condition is met.


Table 10-13. FPU Instruction Set (Optional): Compare and Branch Instruction


            Floating-point Compare    MIPS I
BC1T        Branch on FPU True        MIPS I
BC1F        Branch on FPU False       MIPS I
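The way a compare feeds a later branch can be modeled in C as follows. The sketch treats the Condition bit as bit 23 of an FCR31 image, as described for the Control/Status register, and models only a "less than" compare; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FCR31_C (UINT32_C(1) << 23)   /* Condition bit */

/* Model of a "less than" compare: returns 'fcr31' with C set if a < b and
 * cleared otherwise. Other bits are left unchanged. */
uint32_t fp_compare_lt(uint32_t fcr31, double a, double b)
{
    return (a < b) ? (fcr31 | FCR31_C) : (fcr31 & ~FCR31_C);
}

/* BC1T branches when C is 1, BC1F branches when C is 0. */
bool bc1t_taken(uint32_t fcr31) { return (fcr31 & FCR31_C) != 0; }
bool bc1f_taken(uint32_t fcr31) { return (fcr31 & FCR31_C) == 0; }

/* Usage: 1.0 < 2.0 is true, so BC1T would branch and BC1F would not. */
void compare_branch_example(void)
{
    uint32_t fcr31 = fp_compare_lt(0, 1.0, 2.0);
    assert(bc1t_taken(fcr31) && !bc1f_taken(fcr31));
}
```

Exactly one of the two branches is taken for any state of the Condition bit.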

11. Floating-Point Exception (Option)


This chapter describes FPU floating-point exceptions, including FPU exception types, 
exception trap processing, exception flags, saving and restoring state when handling an 
exception, and trap handlers for IEEE Standard 754 exceptions. 


A floating-point exception occurs whenever the FPU cannot handle either the operands or 
the results of a floating-point operation in its normal way. The FPU responds by 
generating an exception to initiate a software trap or by setting a status flag. 

11.1 Introduction


This chapter describes floating-point exceptions, including FPU exception types, exception
trap processing, exception flags, saving and restoring state when handling an exception,
and trap handlers for IEEE Standard 754 exceptions.


11.2 Exception Types 


The FP Control/Status register described in Chapter 10 contains an Enable bit for each
exception type; exception Enable bits determine whether an exception will cause the FPU
to initiate a trap or set a status flag.


• If a trap is taken, the FPU remains in the state found at the beginning of the
  operation and a software exception handling routine executes.


• If no trap is taken, an appropriate value is written into the FPU destination register
  and execution continues.


The FPU supports the five IEEE Standard 754 exceptions:
• Inexact (I)
• Underflow (U)
• Overflow (O)
• Division by Zero (Z)
• Invalid Operation (V)

Cause bits, Enable bits, and Flag bits (status flags) are used for these exceptions.


The FPU adds a sixth exception type, Unimplemented Operation (E). This exception
indicates the use of a software implementation. The Unimplemented Operation exception
has no Enable or Flag bit; whenever this exception occurs, an unimplemented exception
trap is taken.


Figure 11-1 shows the Control/Status register bits that support exceptions. 


             E    V    Z    O    U    I
Bit #        17   16   15   14   13   12     Cause bits
Bit #             11   10   9    8    7      Enable bits
Bit #             6    5    4    3    2      Flag bits

E: Unimplemented   V: Invalid   Z: Division by Zero   O: Overflow   U: Underflow   I: Inexact


Figure 11-1. Control/Status Register Exception/Flag/Trap/Enable Bits 

11.3 Exception Trap Processing


When a floating-point exception trap is taken, the Cause register indicates that the
floating-point coprocessor is the cause of the exception trap.


The Floating-Point Exception (FPE) code is used, and the Cause bits of the floating-point
Control/Status register indicate the reason for the floating-point exception. These bits are,
in effect, an extension of the system coprocessor Cause register.


11.4 Flags 


A Flag bit is provided for each IEEE exception. This Flag bit is set to 1 on the assertion
of its corresponding exception when no corresponding exception trap is signaled.


The Flag bit is reset by writing a new value into the Status register; flags can be saved
and restored by software either individually or as a group.


When no exception trap is signaled, the floating-point coprocessor takes a default action,
providing a substitute value for the exception-causing result of the floating-point
operation. The particular default action taken depends upon the type of exception. Table
11-1 lists the default action taken by the FPU for each of the IEEE exceptions.


Table 11-1. Default FPU Exception Actions


Field  Description           Rounding  Default Action
                             Mode

I      Inexact exception     Any       Supply a rounded result

U      Underflow exception   RN        Modify underflow values to 0 with the sign of the
                                       intermediate result
                             RZ        Modify underflow values to 0 with the sign of the
                                       intermediate result
                             RP        Modify positive underflows to the format’s smallest
                                       positive finite number; modify negative underflows
                                       to -0.
                             RM        Modify negative underflows to the format’s smallest
                                       negative finite number; modify positive underflows
                                       to 0.

O      Overflow exception    RN        Modify overflow values to ∞ with the sign of the
                                       intermediate result
                             RZ        Modify overflow values to the format’s largest finite
                                       number with the sign of the intermediate result
                             RP        Modify negative overflows to the format’s most
                                       negative finite number; modify positive overflows
                                       to +∞
                             RM        Modify positive overflows to the format’s largest
                                       finite number; modify negative overflows to -∞

Z      Division by zero      Any       Supply a properly signed ∞

V      Invalid operation     Any       Supply 2^31 - 1 result (Word Fixed-Point);
                                       Supply 2^63 - 1 result (Long Fixed-Point);
                                       Otherwise supply a quiet Not a Number
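The default overflow results can be sketched in C for single precision. The RP and RM cases follow the wording of Table 11-1; the RN and RZ cases follow the IEEE 754 defaults (infinity with the sign of the intermediate result for RN, the largest finite number with that sign for RZ), which is an assumption about this implementation. The RM field encodings 0 to 3 correspond to RN, RZ, RP, RM, and the names are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>

enum ovf_rm { OVF_RN = 0, OVF_RZ = 1, OVF_RP = 2, OVF_RM = 3 };

/* Result delivered for a single-precision overflow when the trap is
 * disabled; 'negative' is the sign of the intermediate result. */
float overflow_default_single(bool negative, enum ovf_rm rm)
{
    assert((unsigned)rm <= 3u);
    /* The result is infinite only when the rounding direction points away
     * from zero for this sign (or for round to nearest). */
    bool to_inf = (rm == OVF_RN)
               || (rm == OVF_RP && !negative)
               || (rm == OVF_RM && negative);
    float mag = to_inf ? INFINITY : FLT_MAX;
    return negative ? -mag : mag;
}
```

The pattern mirrors the underflow rows of the table: the directed modes substitute the nearest finite value on the side they may not cross.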

The FPU detects the eight exception causes internally. When the FPU encounters one of
these unusual situations, it causes either an IEEE exception or an Unimplemented 
Operation exception (E). 


Table 11-2 lists the exception-causing situations and contrasts the behavior of the FPU 
with the requirements of the IEEE Standard 754. 


Table 11-2. FPU Exception-Causing Conditions


FPA Internal           IEEE       Trap      Trap      Notes
Result                 Standard   Enable    Disable
                       754

Inexact result                                        Loss of accuracy
Exponent overflow                                     Normalized exponent > Emax
Division by zero                                      Zero is (exponent = Emin - 1, mantissa = 0)
Overflow on convert    V          V (*2)    V (*2)    Source out of integer range, ∞, NaN
to Integer
Signaling NaN
source
Denormalized or        None                           Denormalized is (exponent = Emin - 1 and
QNaN                                                  mantissa <> 0)


(*1) The IEEE Standard 754 specifies an inexact exception on overflow only if the overflow trap is
disabled.


(*2) Some implementations such as TX49 trap as (E) and SW support is required. In TX79
implementation there is NO SW support required.


(*3) Exponent underflow sets the U and I Cause bits if both the U and I Enable bits are not set and the
FS bit is set; otherwise exponent underflow sets the E Cause bit.

11.5 FPU Exceptions


The following sections describe the conditions that cause the FPU to generate each of its
exceptions, and detail the FPU response to each exception-causing condition.


Inexact Exception (I)

The FPU generates the Inexact exception if one of the following occurs:
• the rounded result of an operation is not exact, or
• the rounded result of an operation overflows, or
• the rounded result of an operation underflows and both the Underflow and Inexact
  Enable bits are not set and the FS bit is set.


Trap Enabled Results: If Inexact exception traps are enabled, the result register is not
modified and the source registers are preserved.


Trap Disabled Results: The rounded or overflowed result is delivered to the destination
register if no other software trap occurs.
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Invalid Operation Exception (V) 
Floating-Point format operation 


The Invalid Operation exception is signaled if one or both of the operands are invalid for 
an implemented operation. When the exception occurs without a trap, the MIPS ISA 
defines the result as a quiet Not a Number (QNaN) for Floating-Point format. The 
invalid operations are: 


• Addition or subtraction: magnitude subtraction of infinities, such as (+∞) + (-∞) or
(-∞) - (-∞)

• Multiplication: 0 times ∞, with any signs

• Division: 0/0, or ∞/∞, with any signs

• Comparison of predicates involving ‘<’ or ‘>’ without ‘?’, when the operands are
unordered*

• Any arithmetic operation, when one or both operands is a signaling NaN. A move
(MOV) operation is not considered to be an arithmetic operation, but absolute value
(ABS) and negate (NEG) are considered to be arithmetic operations.

• Comparison or Conversion From Floating-point Format on a signaling NaN.

• Square root: √x, where x is less than zero.


Software can simulate the Invalid Operation exception for other operations that are 
invalid for the given source operands. Examples of these operations include IEEE 
Standard 754-specified functions implemented in software, such as Remainder: x REM 
y, where y is 0 or x is infinite; conversion of a floating-point number to a decimal format
whose value causes an overflow, is infinity, or is NaN; and transcendental functions,
such as ln (-5) or cos^-1 (3). Refer to Appendix D for examples or for routines to handle
these cases.


Trap Enabled Results: The result register is not modified, and the source registers are 
preserved. 


Trap Disabled Results: A quiet NaN is delivered to the destination register if no other 
software trap occurs. 
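
As an informal illustration of the trap-disabled result, the sketch below evaluates several of the invalid operations listed above using the host's floating-point arithmetic. It assumes the host Python uses IEEE 754 doubles; it is not a model of the TX79 FPU, and the function name is hypothetical.

```python
import math

INF = float("inf")

def default_invalid_results():
    """Evaluate a few invalid operations with host IEEE 754 arithmetic."""
    return {
        "(+inf) + (-inf)": INF + (-INF),    # magnitude subtraction of infinities
        "(-inf) - (-inf)": (-INF) - (-INF),
        "0 * inf": 0.0 * INF,
        "inf / inf": INF / INF,
    }

if __name__ == "__main__":
    for expression, value in default_invalid_results().items():
        # Each of these operations delivers a NaN when no trap is taken.
        print(f"{expression:>16} -> {value} (NaN: {math.isnan(value)})")
```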


Conversion to Integer format 


The Invalid Operation exception is also raised when the source operand is an Infinity
(∞) or NaN, or the correctly rounded integer result is outside of the representable range.


Trap Enabled Results: The result register is not modified, and the source registers are 
preserved. 


Trap Disabled Results: The result value 2^31 - 1 (for Word Fixed-Point) or 2^63 - 1 (for
Long Fixed-Point) is delivered to the destination register if no
other software trap occurs.
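
The trap-disabled rule for conversions can be sketched as follows. This is a simplified model under stated assumptions: the function name is hypothetical, and truncation is used in place of the current rounding mode purely to keep the example short.

```python
import math

WORD_MAX = 2**31 - 1   # default result for Word Fixed-Point
LONG_MAX = 2**63 - 1   # default result for Long Fixed-Point

def convert_to_fixed(value, long_format=False):
    """Return (result, invalid) for a float-to-integer conversion with traps disabled."""
    limit = LONG_MAX if long_format else WORD_MAX
    if math.isnan(value) or math.isinf(value):
        return limit, True               # Infinity or NaN source operand
    result = int(value)                  # truncation, for illustration only
    if result > limit or result < -limit - 1:
        return limit, True               # outside the representable range
    return result, False

if __name__ == "__main__":
    print(convert_to_fixed(3.7))
    print(convert_to_fixed(float("nan")))
    print(convert_to_fixed(1e10, long_format=True))
```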


* ‘<’, ‘>’ and ‘?’ are the notation used in IEEE Std 754.
‘?’ means ‘unordered.’ See the Compare instruction in Appendix D.


Division-by-Zero Exception (Z)


The Division-by-Zero exception is signaled on an implemented divide operation if the 
divisor is zero and the dividend is a finite nonzero number. Software can simulate this 
exception for other operations that produce a signed infinity, such as ln (0), sec (π/2), csc
(0), or 0^-1.


Trap Enabled Results: The result register is not modified, and the source registers are 
preserved. 


Trap Disabled Results: The result, when no trap occurs, is a correctly signed infinity.
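
A minimal sketch of the "correctly signed infinity" rule is given below. It assumes the usual IEEE 754 convention that the sign of the result is positive when the operand signs agree and negative otherwise; the function name is hypothetical.

```python
import math

def divide_by_zero_result(dividend, divisor):
    """Return the signed infinity for the Division-by-Zero case, or None if it does not apply."""
    # The exception applies only to a zero divisor and a finite nonzero dividend.
    if divisor != 0.0 or dividend == 0.0 or not math.isfinite(dividend):
        return None
    same_sign = math.copysign(1.0, dividend) == math.copysign(1.0, divisor)
    return math.inf if same_sign else -math.inf

if __name__ == "__main__":
    print(divide_by_zero_result(1.0, 0.0))
    print(divide_by_zero_result(1.0, -0.0))
    print(divide_by_zero_result(0.0, 0.0))   # 0/0 is an Invalid Operation instead
```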


Overflow Exception (O)


The Overflow exception is signaled when the magnitude of the rounded floating-point 
result, with an unbounded exponent range, is larger than the largest finite number of the 
destination format. (This exception also signals an Inexact exception.)


Trap Enabled Results: The result register is not modified, and the source registers are 
preserved. 


Trap Disabled Results: The result, when no trap occurs, is determined by the rounding 
mode and the sign of the intermediate result (See Table 11-3). 


Table 11-3. Values of Overflow Results 


Denormalized Flushed result Rounding Mode 
Result | RN [rz Tere | 


+Emax 
| Negative |e | max | -emax | 


Underflow Exception (U)

Two related events contribute to the Underflow exception:


• creation of a tiny nonzero result between ±2^Emin which can cause some later exception
because it is so tiny


• extraordinary loss of accuracy during the approximation of such tiny numbers by
denormalized numbers.


IEEE Standard 754 allows a variety of ways to detect these events, but requires they be 
detected the same way for all operations. 


Tininess can be detected by one of the following methods: 


• after rounding (when a nonzero result, computed as though the exponent range were
unbounded, would lie strictly between ±2^Emin)


• before rounding (when a nonzero result, computed as though the exponent range and
the precision were unbounded, would lie strictly between ±2^Emin).


The MIPS architecture requires that tininess be detected after rounding. 


Loss of accuracy can be detected by one of the following methods: 


• denormalization loss (when the delivered result differs from what would have been
computed if the exponent range were unbounded)


• inexact result (when the delivered result differs from what would have been computed
if the exponent range and precision were both unbounded).


The MIPS architecture requires that loss of accuracy be detected as an inexact result. 


Trap Enabled Results: If Underflow or Inexact traps are enabled, or if the FS bit is not 
set, then an Unimplemented exception (E) is generated, and the 
result register is not modified and the source registers are 
preserved. 


Trap Disabled Results: If Underflow and Inexact traps are not enabled and the FS bit is 
set, the result is determined by the rounding mode and the sign 
of the intermediate result (See Table 10-4). 
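

The choice between the two outcomes described above can be written as a small decision function. This is a sketch of the stated rule only; the function and parameter names are hypothetical.

```python
def underflow_cause_bits(underflow_enable, inexact_enable, fs):
    """Return the Cause bits set on exponent underflow.

    U and I are set only when both traps are disabled and the FS bit is set;
    in every other case the Unimplemented (E) bit is set instead.
    """
    if not underflow_enable and not inexact_enable and fs:
        return {"U", "I"}
    return {"E"}

if __name__ == "__main__":
    print(underflow_cause_bits(False, False, True))   # result delivered, U and I flagged
    print(underflow_cause_bits(True, False, True))    # Unimplemented exception
    print(underflow_cause_bits(False, False, False))  # Unimplemented exception
```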


Unimplemented Instruction Exception (E) 


Any attempt to execute an instruction with an operation code or format code that has been 
reserved for future definition sets the Unimplemented bit in the Cause field in the FPU 
Control/Status register and traps. The operand and destination registers remain 
undisturbed and the instruction is emulated in software. Any of the IEEE Standard 754 
exceptions can arise from the emulated operation, and these exceptions are simulated. 


The Unimplemented Instruction exception can also be signaled when unusual operands or 
result conditions are detected that the implemented hardware cannot handle properly. 
These include: 


• Denormalized operand, except for Compare instruction
• Quiet Not a Number operand, except for Compare instruction


• Denormalized result or Underflow, when either Underflow or Inexact Enable bit is set
or the FS bit is not set.


• Reserved opcodes
• Unimplemented formats
• Operations which are invalid for their format (for instance, CVT.S.S)


NOTE: Denormalized and NaN operands are only trapped if the instruction is a convert or a
computational operation. A move operation does not trap if its operands are either
denormalized or NaNs.


The use of this exception for such conditions is optional; most of these conditions are 
newly developed and are not expected to be widely used in early implementations. 
Loopholes are provided in the architecture so that these conditions can be implemented 
with assistance provided by software, maintaining full compatibility with the IEEE 
Standard 754. 


Trap Enabled Results: The result register is not modified, and the source registers are 
preserved. 


Trap Disabled Results: This trap cannot be disabled. 


11.6 Saving and Restoring State


Sixteen doubleword¹ coprocessor load or store operations save or restore the coprocessor
floating-point register state in memory. The remaining control and status information
can be saved or restored with CFC1/CTC1 instructions and by saving and restoring the
processor registers. Normally, the Control/Status register is saved first and restored last.


When state is restored, state information in the Control/Status register indicates the 
exceptions that are pending. Writing a zero value to the Cause field of Control/Status 
register clears all pending exceptions, permitting normal processing to restart after the 
floating-point register state is restored. 
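

The effect of writing zero to the Cause field can be sketched on a plain 32-bit value. The bit positions used here (Cause field in bits 17:12, ordered E, V, Z, O, U, I from the top) follow the common MIPS Control/Status register layout and are an assumption, since this section does not state them; the function names are hypothetical.

```python
CAUSE_SHIFT = 12
CAUSE_MASK = 0x3F << CAUSE_SHIFT     # assumed Cause field, bits 17:12
CAUSE_NAMES = "IUOZVE"               # bit 12 = I ... bit 17 = E (assumed order)

def pending_exceptions(control_status):
    """List the exception letters whose Cause bits are set."""
    return [CAUSE_NAMES[i] for i in range(6)
            if (control_status >> (CAUSE_SHIFT + i)) & 1]

def clear_cause(control_status):
    """Return the register value with the Cause field written as zero."""
    return control_status & ~CAUSE_MASK & 0xFFFFFFFF

if __name__ == "__main__":
    saved = (1 << 16) | (1 << 12) | 0x1   # V and I pending, rounding mode bits kept
    print(pending_exceptions(saved))
    print(hex(clear_cause(saved)), pending_exceptions(clear_cause(saved)))
```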


11.7 Trap Handlers for IEEE Standard 754 Exceptions 


The IEEE Standard 754 strongly recommends that users be allowed to specify a trap 
handler for any of the five standard exceptions so that a software subroutine can return a 
value to be used instead of the exceptional operation’s result; the trap handler can either
compute or specify a substitute result to be placed in the destination register of the 
operation. 


By retrieving an instruction using the processor Exception Program Counter (EPC) 
register, the trap handler determines: 


• the exceptions that occurred during the operation
• the operation being performed
• the destination format


On Overflow or Underflow exceptions (except for conversions), and on Inexact exceptions, 
the trap handler gains access to the correctly rounded result by decoding the source register
field of the instruction code and simulating the operation in software.


On Overflow or Underflow exceptions caused by a floating-point conversion, on Invalid 
Operation and on Division-by-Zero exceptions, the trap handler gains access to the 
operand values by decoding the source register field of the instruction code. 
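

Decoding the register fields of the faulting instruction might look like the sketch below. It assumes the standard MIPS coprocessor 1 register-format encoding (opcode in bits 31:26, fmt in 25:21, ft in 20:16, fs in 15:11, fd in 10:6, function in 5:0), which is not spelled out in this section; the function name is hypothetical, and fetching the word at the EPC address is left to the handler.

```python
def decode_cop1(instruction):
    """Split a 32-bit floating-point instruction word into its fields."""
    return {
        "opcode": (instruction >> 26) & 0x3F,
        "fmt": (instruction >> 21) & 0x1F,
        "ft": (instruction >> 16) & 0x1F,    # second source register
        "fs": (instruction >> 11) & 0x1F,    # first source register
        "fd": (instruction >> 6) & 0x1F,     # destination register
        "function": instruction & 0x3F,
    }

if __name__ == "__main__":
    # 0x46062080 is the assumed encoding of: add.s $f2, $f4, $f6
    print(decode_cop1(0x46062080))
```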


The IEEE Standard 754 recommends that, if enabled, the overflow and underflow traps 
take precedence over a separate inexact trap. This prioritization is accomplished in 
software; hardware sets the bits for both the Inexact exception and the Overflow or 
Underflow exception. 
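

Because the hardware sets both bits, the handler has to apply the precedence itself. A minimal sketch follows; it models only the precedence stated above (Overflow or Underflow ahead of Inexact), and the function name and the use of letter sets are hypothetical.

```python
def select_trap(cause_bits, enable_bits):
    """Pick which enabled exception to report, giving O and U precedence over I."""
    enabled_causes = set(cause_bits) & set(enable_bits)
    for name in ("O", "U", "I"):
        if name in enabled_causes:
            return name
    return None   # nothing among O, U, I is both raised and enabled

if __name__ == "__main__":
    print(select_trap("OI", "OI"))   # overflow reported ahead of inexact
    print(select_trap("OI", "I"))    # only the inexact trap is enabled
```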


¹ 32 doublewords if the FR bit is set to 1.


12. PC Trace


This chapter describes the trace functions present on the C790. 


The C790 supports real-time PC tracing. Pipeline status, target addresses of indirect 
jumps, and exception vectors are made available on special signals. The executed 
instruction sequence can be restored from signals and the source program. 


The C790 also supports hardware breakpoints. The breakpoint facility is described in 
Chapter 13. 


12.1 Real-Time PC Tracing


Trace information and non-sequential Program Counters are made available on special 
signal lines of the CPU. 


The following trace information is made available: 


• Instruction being executed in pipeline 0

• Instruction being executed in pipeline 1

• Current execution status (Normal (Sequential), Branch Taken, Jump Target,
Exception Target)


For Indirect jumps, the target address is also made available. For exception vectors, a code 
for the exception vector address is made available. 


12.1.1 Classification of Branch and Jump Instructions 


In this chapter, branches and jumps are classified into three categories (direct
jump, indirect jump and branch) in order to explain the function of PC trace.
The classification is shown in Table 12-1.


Table 12-1. Classification of Branch and Jump Instructions


Class            Instruction
Jump             Direct or Indirect Jump
Direct Jump      J or JAL instruction
Indirect Jump    JR, JALR or ERET instruction
Branch           Any conditional branch instruction




12.1.2 PC Trace Signals 


All PC trace signals operate at half the C790 CPU clock frequency using the BUSCLK 
clock signal. Because of the half frequency operation there are pairs of signals which 
indicate the status of execution within the CPU pipelines. Phase A signals show the status 
corresponding to the even CPU clock cycle and Phase B signals show the status 
corresponding to the odd CPU clock cycle. 


As can be seen from the following figure the execution status of the CPU pipeline during 
time 0 (all time references are in relation to the CPU clock) is put on the phase A signals 
at the next rising edge of BUSCLK during time 2. Similarly the execution status of the 
CPU pipeline during time 1 is put on the phase B signals. 


[Timing diagram: Phase, CPUCLK, BUSCLK, Phase A Signals and Phase B Signals for CPU clock times 0 through 10.]

The following signals are made available for real-time PC tracing. 


• P0EXEA* (Phase A Pipeline 0 Execution Status) Output
• P1EXEA* (Phase A Pipeline 1 Execution Status) Output
• JMPA* (Phase A Jump) Output
• P0EXEB* (Phase B Pipeline 0 Execution Status) Output
• P1EXEB* (Phase B Pipeline 1 Execution Status) Output
• JMPB* (Phase B Jump) Output
• TPCE* (Target PC Enable) Output
• TPC[3:0] (Target PC Bus) Output


(1) P0EXEA* (Phase A Pipeline 0 Execution Status) Output


P0EXEA* indicates whether an instruction has completed execution without generating an
exception (retired) via Pipeline 0 during phase A.


0: An instruction was retired.
1: No instruction was retired.


(2) P1EXEA* (Phase A Pipeline 1 Execution Status) Output


P1EXEA* indicates whether an instruction retired via Pipeline 1 during phase A. Note that if
this signal is asserted at the same time as P0EXEA*, two instructions were retired
simultaneously during phase A via pipelines 0 and 1, but there is no indication as to which
specific instruction was retired via which pipeline.


0: An instruction was retired.
1: No instruction was retired.


(3) JMPA* (Jump Phase A) Output


A jump was retired during phase A or a conditional branch instruction was retired and the
branch was taken during phase A. Note that exceptions do not assert this signal.


0: A jump or taken conditional branch instruction was retired.
1: No jump or taken conditional branch instruction was retired.


(4) P0EXEB* (Phase B Pipeline 0 Execution Status) Output


P0EXEB* indicates whether an instruction retired via Pipeline 0 during phase B.


0: An instruction was retired.
1: No instruction was retired.


(5) P1EXEB* (Phase B Pipeline 1 Execution Status) Output


P1EXEB* indicates whether an instruction retired via Pipeline 1 during phase B. Note that if
this signal is asserted at the same time as P0EXEB*, two instructions were retired
simultaneously during phase B via pipelines 0 and 1, but there is no indication as to which
specific instruction was retired via which pipeline.


0: An instruction was retired.
1: No instruction was retired.


(6) JMPB* (Jump Phase B) Output


A jump was retired during phase B or a conditional branch instruction was retired and the
branch was taken during phase B. Note that exceptions do not assert this signal.


0: A jump or taken conditional branch instruction was retired.
1: No jump or taken conditional branch instruction was retired.
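

A trace probe could turn one BUSCLK sample of the execution status signals into retirement counts as sketched below. The signals are treated as active low, as the 0/1 encodings above indicate; the function name and the idea of sampling the four signals as integers are hypothetical.

```python
def retired_in_sample(p0exea, p1exea, p0exeb, p1exeb):
    """Return (phase A count, phase B count) of retired instructions.

    Each argument is the sampled level of an active-low signal:
    0 means an instruction was retired, 1 means none was.
    """
    phase_a = int(p0exea == 0) + int(p1exea == 0)
    phase_b = int(p0exeb == 0) + int(p1exeb == 0)
    return phase_a, phase_b

if __name__ == "__main__":
    # Two instructions in phase A, one (via pipeline 1) in phase B.
    print(retired_in_sample(0, 0, 1, 0))
```

A count of 2 in either phase means both pipelines retired an instruction, in which case the trace alone does not say which instruction used which pipeline.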


(7) TPCE* (Target PC Enable) Output


When this signal is asserted the TPC bus indicates the type of target PC that will be made 
available. 


0: TPC bus indicates type of target PC. 
1: TPC bus has either the target PC or the exception vector address code 
or has no information. 


The normal sequence of operation for the TPCE* and the TPC[3:0] signals is as follows: 
First TPCE* is asserted and simultaneously TPC[3:0] contains information about the type 
of the target PC (non-sequential PC). Next TPCE* is deasserted and either the target PC 
for indirect jumps is made available on the TPC[3:0] bus or for exceptions an exception 
vector address code is made available on the TPC[3:0] bus. 


(8) TPC[3:0] (Target PC) Output 


TPC[3:0] either indicates the type of the target PC address or the target address of 
indirect jump instructions or exception vector address codes. 


TPC[3:0] when TPCE* is asserted 


When TPCE* is asserted the type of the target PC address is made available on 
TPC[3:0]. Each bit of TPC[3:0] indicates a different type and multiple bits can be 
active at the same time. 

• TPC[0]: Jump Target during Phase A

When this signal is asserted it indicates that the target instruction of an
Indirect Jump instruction (includes JR, JALR and ERET) is retired during
Phase A. The target address is made available on TPC[3:0] in the next cycle if
neither TPC[2] nor TPC[3] is asserted simultaneously with this signal.


• TPC[1]: Exception Target during Phase A

When this signal is asserted it indicates that the first instruction of an
exception handler is retired during Phase A. The exception vector address is
made available on TPC[3:0] in the next cycle if neither TPC[2] nor TPC[3] is
asserted simultaneously with this signal.


• TPC[2]: Jump Target during Phase B

When this signal is asserted it indicates that the target instruction of an
Indirect Jump instruction is retired during Phase B. The target address is
made available on TPC[3:0] in the next cycle.


• TPC[3]: Exception Target during Phase B

When this signal is asserted it indicates that the first instruction of an
exception handler is retired during Phase B. The exception vector address is
made available on TPC[3:0] in the next cycle.
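

The type cycle can be decoded as sketched below. The sketch assumes that both TPCE* and the individual TPC bits are active low during the type cycle (0 = asserted); the function name and the returned strings are hypothetical.

```python
TPC_TYPE_NAMES = [
    "jump target, phase A",        # TPC[0]
    "exception target, phase A",   # TPC[1]
    "jump target, phase B",        # TPC[2]
    "exception target, phase B",   # TPC[3]
]

def decode_tpc_type(tpce, tpc):
    """Return the asserted target types, or None when TPCE* is deasserted."""
    if tpce != 0:
        return None   # TPC carries address bits, a vector code, or nothing
    return [TPC_TYPE_NAMES[bit] for bit in range(4) if ((tpc >> bit) & 1) == 0]

if __name__ == "__main__":
    print(decode_tpc_type(0, 0b0111))   # only TPC[3] is low
    print(decode_tpc_type(0, 0b1110))   # only TPC[0] is low
    print(decode_tpc_type(1, 0b0000))   # not a type cycle
```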


TPC[3:0] when TPCE* is deasserted


When TPCE* is not asserted, TPC[3:0] can carry one of the following three types of
information:


1. There is no meaningful information on TPC. This happens most of the time 
when the program is executing sequentially. 


2. The target address is made available because in the previous cycle TPCE*
was asserted and TPC[0] or TPC[2] was equal to 0. The target address starts
with the least significant four bits of the target instruction address (bits[5:2]).


3. An exception vector address code is made available because in the previous
cycle TPCE* was asserted and TPC[1] or TPC[3] was equal to 0. The
exception vector address codes are shown in Table 12-2.


Table 12-2. Exception Vector Address Codes 


STATUS.BEV | STATUS.DEV | STATUS.EXL Vector 
Address TPC 0]) 


Reset, NMI X OxBFCO 0000 (1000) 
| TLB lis X 0 OxBFCO 0200 1100 | 


e ) 
pres iss [00 F000 000] 0 10000 
ripwiss | Si SS ——*d St _—fv0 380 | 98 TTT) 
a CCE ENCED 
atop S30 [x Jo fe or] 2 10 


Performance OxBFCO 0280 13 san 
Counter 
Performance 0x8000 0080 (0001) 
a 


se een Se 
Fcommon [1 |i = BO GO [iS TIT) | 
Feommen [0 [x | xf oxa00 oreo | 3 oor) | 
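

Reassembling a target address from the values seen on TPC[3:0] could be sketched as follows. The sketch assumes that the first value carries bits [5:2], that each later cycle carries the next-higher four bits, that the address bits appear with true (non-inverted) polarity, and that any bits above bit 31 in the last value are ignored. None of these details beyond the starting position is stated in this section, and the function name is hypothetical.

```python
def assemble_target_address(nibbles):
    """Combine 4-bit TPC values, least significant first, into a 32-bit address."""
    address = 0
    for index, nibble in enumerate(nibbles):
        address |= (nibble & 0xF) << (2 + 4 * index)   # first value holds bits [5:2]
    return address & 0xFFFFFFFF                         # drop bits above bit 31

if __name__ == "__main__":
    target = 0x80001234
    samples = [(target >> (2 + 4 * i)) & 0xF for i in range(8)]
    print(samples, hex(assemble_target_address(samples)))
```

Bits [1:0] are not transmitted, which is consistent with instruction addresses being word aligned.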


12.1.3 Priority of Target Addresses


The target address for an indirect jump instruction or an exception vector address code is 
made available on TPC[3:0]. For an indirect jump instruction it takes multiple cycles (8 
BUSCLK cycles or 16 CPU clock cycles) for the complete target address to be made 
available on the TPC[3:0] bus. As such multiple conditions can occur simultaneously and 
there are certain priorities associated with putting out the target address. The rules 
governing what is made available on the TPC[3:0] bus are listed below: 


1. If a new indirect jump instruction is retired while the target PC for a
previous indirect jump instruction is still being put out on TPC[3:0], the new indirect
jump instruction’s target PC will be signaled and start coming out on the
TPC[3:0] bus, and the previous target PC output will be terminated.


2. If an exception is taken while the target PC for a previous indirect jump
instruction is still being put out on TPC[3:0], the exception vector address code
will be signaled and start coming out on the TPC[3:0] bus, and the previous
target PC output will be terminated.


The rules are also described in the following flowchart. 


[Flowchart: decision boxes "New Indirect Jump or Exception Target Retired?" (branches: Exception, Indirect Jump) and "Previous Target Address Is Being Output Currently?"; action boxes "Terminate Outputting Current PC Output", "Suspend Outputting Previous Target Address Output", "Output Exception Target", "Start Outputting Target Address of Jump" and "Resume Outputting Previous Target Address".]


Figure 12-1. Priority of Outputting Jump or Exception Target
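

The two written rules in this section can be modelled with a small queue, as in the sketch below. It implements only those rules (a new indirect jump or exception terminates an output in progress) and does not model the suspend and resume boxes that appear in the flowchart; the class and method names are hypothetical.

```python
class TargetOutput:
    """Toy model of the value stream driven onto TPC[3:0]."""

    def __init__(self):
        self.pending = []   # values still waiting to be driven

    def start(self, values):
        """Begin a new output; return True if a previous one was terminated."""
        terminated = bool(self.pending)
        self.pending = list(values)
        return terminated

    def tick(self):
        """Return the value driven in this BUSCLK cycle, or None when idle."""
        return self.pending.pop(0) if self.pending else None

if __name__ == "__main__":
    bus = TargetOutput()
    bus.start([0x1, 0x2, 0x3])          # first indirect jump target
    print(bus.tick())                   # 1
    print(bus.start([0x9]))             # new target arrives: previous output terminated
    print(bus.tick(), bus.tick())       # 9 None
```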


12.1.4 Examples of PC Tracing


The following sections contain examples of program execution and the corresponding
waveforms of the PC trace signals. Note that when two instructions are retired
simultaneously, the figures indicate, purely for the sake of illustration, which instruction is
executed in which pipeline. In reality, it is not known which instruction is
retired from which pipeline in this case.


12.1.4.1 Sequential Execution


This is an example of sequential program execution. The program fragment is as follows: 


mul 


The PC trace signals for the program fragment are shown below: 


[Waveform: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, JMPA*, JMPB*, TPCE* and TPC[3:0].]


Figure 12-2. Waveform for Sequential Execution


12.1.4.2 Conditional Branch


This is an example of a program with conditional branch instructions. Both the branch
taken and not taken cases are illustrated. The program fragment is as follows:


add


beq L0 Not Taken


beq L1 Taken


bne L2 Taken


L2: sub


The PC trace signals for the program fragment are shown below:


Taken
Pipe 0 | add | add | add | - | - | add | bne | sub |
Pipe 1 | - | beq | lw | - | beq | add | sll | sub |
Not Taken Taken
P0EXEA* add add bne
P1EXEA* lw beq sll
P0EXEB* add add sub
P1EXEB* beq add sub
JMPA* beq bne
JMPB*
TPCE*
TPC[3:0]


Figure 12-3. Waveform for Conditional Branch


12.1.4.3 Indirect Jump (Target in Phase A)


This is an example of a program with an indirect jump instruction which is retired during
phase B. The program fragment is as follows:


add 
add 
jr L1
lw


L1: xor


sw 
sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 
Pipe 0 | add | add | - | - | xor | ori | sll | sub |
P0EXEA* add xor sll
P1EXEA* lw add sw
P0EXEB* add ori sub
P1EXEB* jr ori sub
JMPA*
JMPB* jr
TPCE* xor

9 Bus Cycles


TA[x:y] = Target address bit x to y 


Figure 12-4. Waveform for Indirect Jump (Target in Phase A) 


12.1.4.4 Indirect Jump (Target in Phase B)


This is an example of a program with an indirect jump instruction which is retired during
phase A. The program fragment is as follows:


add 
add 
jr L1
lw


L1: xor


sw 
sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 
Pipe 1 | jr | lw | - | xor | add | ori | sw | sub |

Target
P0EXEA* add sll
P1EXEA* jr add sw
P0EXEB* ori sub
P1EXEB* lw xor ori sub
JMPA* jr
JMPB*
TPCE* xor

8 Bus Cycles


Figure 12-5. Waveform for Indirect Jump (Target in Phase B) 




12.1.4.5 Indirect Jump (During Target PC Output) 


This is an example of a program with two indirect jump instructions. While the target 
address PC associated with the first indirect jump instruction is being put out the second 
indirect jump instruction is retired. Thus the first target PC output is terminated and the 
second target PC output is signaled and then made available. The program fragment is as 
follows: 


add 
add 
jr L1
lw
L1: xor
add
jr L2
add
L2: sw
sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 
Target Target
Pipe 0 | add | add | - | - | xor | jr | - | - | sll | sub |
Pipe 1 | - | jr | lw | - | add | add | - | - | sw | sub |


Figure 12-6. Waveform for Indirect Jump (During Target PC Output) 


12.1.4.6 Exception (Target in Phase B)


This is an example of a program which generates an exception. The target instruction 
(first instruction of the exception handler) retires in phase B. The program fragment is 
shown below. The label ExHnd identifies the first instruction of the exception handler. 


add

add

add

lw 

teq # Generates exception 


ExHnd: xor 
add 
sw 

sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 
More stall cycles might be inserted.

Exception Target
Pipe 0 | add | add | - | - | - | xor | sll | sub |
Pipe 1 | - | add | lw | - | - | add | sw | sub |
P0EXEA* add sll
P1EXEA* lw sw
P0EXEB* add xor sub
P1EXEB* add add sub
JMPA*
JMPB*
TPCE* xor
TPC[3:0] E.Code


E.Code = Exception Vector Code 


Figure 12-7. Waveform for Exception (Target in Phase B) 


12.1.4.7 Exception (During Target PC Output)


This is an example of a program which generates an exception while a target PC from an 
earlier indirect jump instruction is being made available. The target PC output is 
terminated and the exception vector address code is signaled and then made available. 
The target instruction (first instruction of the exception handler) retires in phase B. The 
program fragment is shown below. The label ExHnd identifies the first instruction of the 
exception handler. 


add 

add

add

lw 

teq # Generates exception 


ExHnd: xor 
add 
sw 

sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 


More stall cycles might be inserted.

Exception Target
Pipe 0 | add | add | - | - | - | xor | sll | sub |
Pipe 1 | - | add | lw | - | - | add | sw | sub |
P0EXEA* add sll
P1EXEA* lw sw
P0EXEB* add xor sub
P1EXEB* add add sub
JMPA*
JMPB*
TPCE* xor
TPC[3:0] TA13:10 TA17:14 TA21:18 0111


TAxx:yy = Target Address bit xx to yy 
E.Code = Exception Vector Code 


Figure 12-8. Waveform for Exception (During Target PC Output) 


12.1.4.8 Exception Generated by Branch or Jump Instruction


This is an example of a program in which an indirect jump instruction generates an
exception. As such the program jumps to the exception handler and the only thing
indicated is the exception vector address code and not the jump. The target instruction
(first instruction of the exception handler) retires in phase B. The program fragment is
shown below. The label ExHnd identifies the first instruction of the exception handler.


add 

add 

add 

lw 

jr # Generates an exception 
nop # Branch delay slot 


ExHnd: xor 
add 
sw 

sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 
More stall cycles might be inserted.

Exception Target
Pipe 0 | add | add | - | - | - | xor | sll | sub |
Pipe 1 | - | add | lw | - | - | add | sw | sub |
P0EXEA* add sll
P1EXEA* lw sw
P0EXEB* add xor sub
P1EXEB* add add sub
JMPA*
JMPB*
TPCE* xor
TPC[3:0] 0111 E.Code


E.Code = Exception Vector Code 


Figure 12-9. Waveform for Exception Generated by Branch or Jump Instruction 


12.1.4.9 Exception Generated by Branch Delay Slot Instruction


This is an example of a program in which the branch delay slot instruction generates an
exception. As such the program jumps to the exception handler and the only thing
indicated is the exception vector address code and not the jump. The target instruction
(first instruction of the exception handler) retires in phase B. The program fragment is
shown below. The label ExHnd identifies the first instruction of the exception handler.


add
add
add
lw 
jr 
lw # Generates an exception 


ExHnd: xor 
add 
sw 
sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 
More stall cycles might be inserted.

Exception Target
Pipe 0 | add | add | jr | - | - | xor | sll | sub |
Pipe 1 | - | add | lw | - | - | add | sw | sub |
P0EXEA* add jr sll
P1EXEA* lw sw
P0EXEB* add xor sub
P1EXEB* add add sub
JMPA* jr
JMPB*
TPCE* xor
TPC[3:0] E.Code


E.Code = Exception Vector Code 


Figure 12-10. Waveform for Exception Generated by Branch Delay Slot Instruction 


12.1.4.10 Exception Generated by Target Instruction


This is an example of a program in which the target instruction of an indirect jump
generates an exception. As such the program jumps to the exception handler and the only
thing indicated is the exception vector address code and not the jump. The target
instruction (first instruction of the exception handler) retires in phase B. The program
fragment is shown below. The label ExHnd identifies the first instruction of the exception
handler.


add 

add 

add 

lw 

jr L1
nop


L1: lw # Generates an exception
and 


ExHnd: xor 
add 
sw 

sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 


More stall cycles might be inserted.

[Waveform: Phase, CPUCLK, BUSCLK, Pipe 0 and Pipe 1.]

JMPA* jr
JMPB*
TPCE* xor
TPC[3:0] E.Code


Figure 12-11. Waveform for Exception Generated by Target Instruction 


12.1.4.11 Back to Back Exceptions (Case I)


This is an example of a program in which two back to back exceptions are generated. The 
program jumps to the first exception handler but then immediately jumps to the second 
exception handler. The target instruction (first instruction of the second exception 
handler) retires in phase A. The exception vector address code for the first handler is 
never made available. The program fragment is shown below. The label ExHnd1 identifies 
the first instruction of the first exception handler and the label ExHnd2 identifies the first 
instruction of the second exception handler. 


add

add # Generates the first exception
ExHnd1: xor # Generates the second exception

xor 
ExHnd2: sw 

sll 

sub 

sub 


The PC trace signals for the program fragment are shown below: 


More stall cycles might be inserted.

[Waveform: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1 and TPC[3:0], with the Exception Target marked.]


E.Code = Exception Vector Code


Figure 12-12. Waveform for Back to Back Exceptions (Case I)


12.1.4.12 Back to Back Exceptions (Case II)


This is an example of a program in which two (almost) back to back exceptions are
generated. The program jumps to the first exception handler and then generates an
exception when executing the second instruction of the exception handler. It then jumps to
the second exception handler. The target instruction (first instruction of the first exception
handler) retires in phase A. In contrast to the case discussed above, the exception vector
address codes for both handlers are made available. The program fragment is
shown below. The label ExHnd1 identifies the first instruction of the first exception
handler and the label ExHnd2 identifies the first instruction of the second exception
handler.


add 
add # Generates the first exception 


ExHnd1: xor
xor # Generates the second exception


ExHnd2: sw 

sll 
sub 
sub 


The PC trace signals for the program fragment are shown below: 
More stall cycles might be inserted.

[Waveform: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1 and TPC[3:0], with two Exception Targets marked.]

TPCE* xor sw


E.Code = Exception Vector Code 


Figure 12-13. Waveform for Back to Back Exceptions (Case II) 


13. Hardware Breakpoint


This chapter describes the hardware breakpoint functions for debugging that are present on the C790.


13.1 Hardware Breakpoint


The C790 provides a hardware breakpoint mechanism for debugging purposes. (In this section,
a hardware breakpoint is sometimes referred to simply as a “breakpoint”.) This function allows users
to set an instruction breakpoint and a data address/value breakpoint, and signals the
occurrence of a breakpoint event to an external probe. The following summarizes the features of
the breakpoint function.


• Provides both instruction and data breakpointing on virtual addresses.
• Instruction address breakpoint with address masking.
• Data breakpoint with masking. A data breakpoint can be set on the following
  events:
    - Address with masking
    - Value with masking
    - Read/write
• Independent exception event control for instruction and data.
• Individual event control by processor operating mode/exception level.
• Provides a trigger signal to external probes synchronized with the breakpointing
  event.


Hardware breakpointing is implemented as a part of Coprocessor 0. The breakpoint is
configured by setting the 7 Breakpoint registers with special MTC0/MFC0 instructions.
Figure 13-1 shows the basic structure of the breakpoint hardware.

A breakpoint can generate a breakpoint exception, which is categorized as a Level2 exception
and has a dedicated exception vector (see 5. Exception). This exception is masked only in
Level2 mode, and exception generation itself can be controlled by the Breakpoint Control
Register described in the following section. Note that some breakpoint exceptions are
imprecise. For instance, a value breakpoint on a load instruction is basically imprecise
because the load instruction may retire from the pipeline before the memory contents are
actually acquired. The imprecise cases are:


• All data value breakpoints on load instructions
• Data value breakpoint on the SWC1 instruction


13.1.1 Hardware Breakpoint Signal


To signal a breakpoint occurrence, the C790 activates a signal called TRIG* whenever a
trigger condition is met.


• TRIG* (Trigger Output)    Output


This signal is asserted for two BUSCLK cycles when a trigger condition is met.
[Block diagram not recoverable from the source. Legible labels: inputs fetch PC, load/store address and load/store value; Address / Value Register (DVB shown); Breakpoint Event; Breakpoint (Exception Control); Pipeline Control; Exception; Trigger to external probe (TRIG*).]


Figure 13-1. Overall Structure of Hardware Breakpoint


13.2 Breakpoint Registers


The hardware breakpoint function comprises 3 pairs of breakpoint registers and one control
register, listed below. Each breakpoint register pair consists of one breakpoint value
register and one breakpoint mask register.


• Breakpoint Control Register (BPC)
• Instruction Address Breakpoint Registers
    - Instruction Address Breakpoint Register (IAB)
    - Instruction Address Breakpoint Mask Register (IABM)
• Data Address Breakpoint Registers
    - Data Address Breakpoint Register (DAB)
    - Data Address Breakpoint Mask Register (DABM)
• Data Value Breakpoint Registers
    - Data Value Breakpoint Register (DVB)
    - Data Value Breakpoint Mask Register (DVBM)
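All 7 registers are 32-bit read/write registers and are assigned to Coprocessor 0 register 24.
Because they share one register number, the C790 provides extended MTC0/MFC0 instructions
for accessing them, and these instructions must be used instead of the conventional
MTC0/MFC0 instructions. Table 13-1 and Table 13-2 summarize the instructions for
accessing the registers.


Table 13-1. Set a new value into breakpoint registers


Mnemonic    Operation
MTBPC       Move to Breakpoint Control Register
MTIAB       Move to Instruction Address Breakpoint Register
MTIABM      Move to Instruction Address Breakpoint Mask Register
MTDAB       Move to Data Address Breakpoint Register
MTDABM      Move to Data Address Breakpoint Mask Register
MTDVB       Move to Data Value Breakpoint Register
MTDVBM      Move to Data Value Breakpoint Mask Register


Table 13-2. Get the value from breakpoint registers


Mnemonic    Operation
MFBPC       Move from Breakpoint Control Register
MFIAB       Move from Instruction Address Breakpoint Register
MFIABM      Move from Instruction Address Breakpoint Mask Register
MFDAB       Move from Data Address Breakpoint Register
MFDABM      Move from Data Address Breakpoint Mask Register
MFDVB       Move from Data Value Breakpoint Register
MFDVBM      Move from Data Value Breakpoint Mask Register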

13.2.1 Breakpoint Control Register (BPC)


The BPC register contains enable bits and status bits for controlling the breakpointing of
both instructions and data. This register consists of 5 groups of bit fields:


• Breakpoint overall control (bit[31:28])
  These bits control the operation mode of the breakpointing.

• Instruction breakpoint control (bit[26:23])
  These bits specify the processor modes in which the instruction breakpoint is
  enabled.

• Data breakpoint control (bit[21:18])
  These bits specify the processor modes in which the data breakpoint is enabled.

• Signaling control (bit[17:15])
  These bits control whether a breakpoint event causes a breakpoint exception
  and/or a trigger signal.

• Breakpoint status (bit[2:0])
  These bits indicate the type of breakpoint event. The breakpoint exception handler
  uses this part to identify which breakpoint event occurred.
The following shows the detailed bitmap of the BPC register.


 Bit    31   30   29   28   27   26   25   24   23   22   21   20   19   18   17   16   15   14-3   2    1    0
 Field  IAE  DRE  DWE  DVE  0    IUE  ISE  IKE  IXE  0    DUE  DSE  DKE  DXE  ITE  DTE  BED  0      DWB  DRB  IAB

Table 13-3 describes the BPC register fields.

Table 13-3. BPC Register Fields


IAE (bit 31)    Access: Read/Write
    Instruction Address Enable. This bit enables/disables instruction address breakpointing.
    0: disable instruction address breakpointing
    1: enable instruction address breakpointing

DRE (bit 30)    Access: Read/Write
    Data Read Enable. This bit enables data load address breakpointing.
    0: disable breakpointing on reads
    1: enable breakpointing on reads

DWE (bit 29)    Access: Read/Write
    Data Write Enable. This bit enables data store address breakpointing.
    0: disable breakpointing on writes
    1: enable breakpointing on writes

DVE (bit 28)    Access: Read/Write    Initial value: Undefined
    Data Value Enable. This bit is valid only when DRE and/or DWE are set to 1. When DVE
    is set to 1, data read breakpoints (DRE == 1) are further qualified by the value of the
    data read, and data write breakpoints (DWE == 1) are further qualified by the value of
    the data written. Note that data value breakpoints for data reads are imprecise. See
    section 13.1 (“Hardware Breakpoint”) for more details.

IUE (bit 26)    Access: Read/Write    Initial value: Undefined
    Instruction break - User Enable. This bit enables instruction address breakpointing in
    (standard) user mode. This bit is only valid if IAE is set to 1.
    0: disable instruction address breakpointing in User mode
    1: enable instruction address breakpointing in User mode

ISE (bit 25)    Access: Read/Write    Initial value: Undefined
    Instruction break - Supervisor Enable. This bit enables instruction address
    breakpointing in supervisor mode. This bit is only valid if IAE is set to 1.
    0: disable instruction address breakpointing in Supervisor mode
    1: enable instruction address breakpointing in Supervisor mode

IKE (bit 24)    Access: Read/Write    Initial value: Undefined
    Instruction break - Kernel Enable. This bit enables instruction address breakpointing in
    non-exception kernel mode, i.e. when both STATUS.EXL and STATUS.ERL are 0. This bit
    is only valid if IAE is set to 1.
    0: disable instruction address breakpointing in Kernel mode
    1: enable instruction address breakpointing in Kernel mode

IXE (bit 23)    Access: Read/Write    Initial value: Undefined
    Instruction break - EXL mode Enable. This bit enables instruction address breakpointing
    in exception kernel mode, i.e. when STATUS.EXL is 1 and STATUS.ERL is 0. This bit is
    only valid if IAE is set to 1.
    0: disable instruction address breakpointing in EXL mode
    1: enable instruction address breakpointing in EXL mode

rsvd (bit 22)    Access: Read
    Reserved - must be written as zeros by software. The processor returns zeros in these
    bit positions when read.
DUE (bit 21)    Access: Read/Write    Initial value: Undefined
    Data break - User Enable. This bit enables data breakpointing in User mode. This bit is
    only valid if DWE or DRE is set to 1.
    0: disable data breakpointing in User mode
    1: enable data breakpointing in User mode

DSE (bit 20)    Access: Read/Write    Initial value: Undefined
    Data break - Supervisor Enable. This bit enables data breakpointing in Supervisor mode.
    This bit is only valid if DWE or DRE is set to 1.
    0: disable data breakpointing in Supervisor mode
    1: enable data breakpointing in Supervisor mode

DKE (bit 19)    Access: Read/Write    Initial value: Undefined
    Data break - Kernel Enable. This bit enables data breakpointing in Kernel mode, i.e. when
    both STATUS.EXL and STATUS.ERL are 0. This bit is only valid if DWE or DRE is set to 1.
    0: disable data breakpointing in Kernel mode
    1: enable data breakpointing in Kernel mode

DXE (bit 18)    Access: Read/Write    Initial value: Undefined
    Data break - EXL mode Enable. This bit enables data breakpointing in Exception Kernel
    mode, i.e. when STATUS.EXL is 1 and STATUS.ERL is 0. This bit is only valid if at least
    one of DRE or DWE is set to 1.
    0: disable data breakpointing in EXL mode
    1: enable data breakpointing in EXL mode

ITE (bit 17)    Access: Read/Write    Initial value: Undefined
    Instruction Trigger Enable. This bit enables the generation of the trigger signal when an
    instruction breakpoint occurs.
    0: disable instruction breakpoint trigger
    1: enable instruction breakpoint trigger

DTE (bit 16)    Access: Read/Write    Initial value: Undefined
    Data Trigger Enable. This bit enables the generation of the trigger signal when a data
    breakpoint occurs.
    0: disable data breakpoint trigger
    1: enable data breakpoint trigger

BED (bit 15)    Access: Read/Write    Initial value: Undefined
    Breakpoint Exception Disable. This bit disables the entry into the debug exception
    handler. Note that the setting of this bit does not affect trigger signal generation.
    0: enable entry into debug exception handler
    1: disable entry into debug exception handler

rsvd (bits 14-3)
    Reserved - must be written as zeros by software. The processor returns zeros in these
    bit positions when read.

DWB (bit 2)    Access: Read/Write    Initial value: Undefined
    Data Write Breakpoint. This status bit indicates whether a data breakpoint has occurred
    on a write or not.
    0: no data breakpoint has occurred on a write
    1: data breakpoint has occurred on a write

DRB (bit 1)    Access: Read/Write    Initial value: Undefined
    Data Read Breakpoint. This status bit indicates whether a data breakpoint has occurred
    on a read or not.
    0: no data breakpoint has occurred on a read
    1: data breakpoint has occurred on a read

IAB (bit 0)    Access: Read/Write    Initial value: Undefined
    Instruction Address Breakpoint. This status bit indicates whether an instruction address
    breakpoint has occurred or not.
    0: no instruction address breakpoint has occurred
    1: instruction address breakpoint has occurred
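
As an illustration of how the fields in Table 13-3 fit together, the following Python sketch builds a BPC image from field names and decodes the status bits. It is a host-side model only, not C790 code: the names `BPC_BITS`, `bpc_value` and `bpc_status` are hypothetical and introduced here for illustration, while the bit positions are those listed in Table 13-3.

```python
# Bit positions as listed in Table 13-3. The dictionary and function names
# are illustrative assumptions, not part of any Toshiba tool or library.
BPC_BITS = {
    "IAE": 31, "DRE": 30, "DWE": 29, "DVE": 28,
    "IUE": 26, "ISE": 25, "IKE": 24, "IXE": 23,
    "DUE": 21, "DSE": 20, "DKE": 19, "DXE": 18,
    "ITE": 17, "DTE": 16, "BED": 15,
    "DWB": 2, "DRB": 1, "IAB": 0,
}


def bpc_value(*fields):
    """Return a 32-bit BPC image with the named fields set to 1."""
    value = 0
    for name in fields:
        value |= 1 << BPC_BITS[name]
    return value & 0xFFFFFFFF


def bpc_status(bpc):
    """Return the names of the status bits (DWB, DRB, IAB) set in a BPC image."""
    return [name for name in ("DWB", "DRB", "IAB") if bpc & (1 << BPC_BITS[name])]


if __name__ == "__main__":
    # Instruction breakpoint enabled in user and supervisor mode.
    print(hex(bpc_value("IAE", "IUE", "ISE")))
    # A handler would inspect the low three bits to find the cause.
    print(bpc_status(0x5))
```

A breakpoint exception handler would perform the equivalent of `bpc_status` on the value read with MFBPC to decide which event occurred.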
13.2.2 Instruction Address Breakpoint Register (IAB) / Instruction
Address Breakpoint Mask Register (IABM)


 31                                        2 1   0
 |                 IAB                     |  0  |

Figure 13-2. Instruction Address Breakpoint Register


 31                                        2 1   0
 |                 IABM                    |  0  |

Figure 13-3. Instruction Address Breakpoint Mask Register


This register pair holds the instruction breakpointing address. Both the value in the IAB
register and the current fetch PC are masked by the value in IABM. If the masked values are
equal, the condition for the instruction address breakpoint becomes true. As the fetch PC is
always word-aligned, bit 0 and bit 1 of these registers are fixed to zero.
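
The masked comparison described above can be modelled in a few lines. This is a minimal Python sketch of the comparison only, assuming that a mask bit of 1 means "compare this bit" (as the sample code comments in section 13.3 state); the function name `iab_match` is hypothetical.

```python
def iab_match(fetch_pc, iab, iabm):
    """Model of the instruction address breakpoint condition.

    Both the fetch PC and IAB are masked by IABM; bits 1:0 of IAB/IABM
    are fixed to zero, so they never take part in the comparison.
    """
    mask = iabm & 0xFFFFFFFC
    return (fetch_pc & mask) == (iab & mask)


if __name__ == "__main__":
    # Any word-aligned PC in 0x1234_5600..0x1234_56ff matches this setting.
    print(iab_match(0x12345610, 0x12345600, 0xFFFFFF00))
    print(iab_match(0x12345700, 0x12345600, 0xFFFFFF00))
```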


13.2.3 Data Address Breakpoint Register (DAB) /
Data Address Breakpoint Mask Register (DABM)


This register pair holds the data breakpointing address. Both the value in the DAB register
and the target address of the load/store operation are masked by the value in DABM. If the
masked values are equal, the condition for the data address breakpoint becomes true. These
registers are 32 bits wide and are readable/writable.


 31                                              0
 |                    DAB                        |

Figure 13-4. Data Address Breakpoint Register


 31                                              0
 |                    DABM                       |

Figure 13-5. Data Address Breakpoint Mask Register
13.2.4 Data Value Breakpoint Register (DVB) /
Data Value Breakpoint Mask Register (DVBM)


This register pair holds the value for data value breakpointing. Both the value in DVB and
the lower 32 bits of the load/store data are masked with the value in DVBM. If the masked
values are equal, the condition for the data value breakpoint becomes true. Note that enabling
the data value breakpoint implies activating data address breakpointing (setting either or
both of the DRE/DWE bits in BPC), and therefore a breakpoint event for a data value only
happens if the conditions for both the data address breakpoint and the data value breakpoint
are true.

Note that the comparison of the data value is always performed on 32 bits regardless of the
width of the load/store operation: the store value coming from a GPR is truncated to a 32-bit
value for comparison, and the load value is appropriately sign-extended or merged with the
contents of the GPR (unaligned cases) and then the least significant 32 bits are used for
comparison. For instance, the most significant (64+32) bits/32 bits are truncated on data
value comparison for LQ/SQ/LD/SD instructions, while the value from memory is sign-extended
to form a 32-bit value for LB/LH instructions.
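
The effect of the fixed 32-bit comparison width can be sketched in Python as follows. This is an illustrative model under the rules stated above, not a description of the hardware implementation; `sign_extend` and `dvb_match` are hypothetical helper names.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of `value` to a Python integer."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign


def dvb_match(data, dvb, dvbm):
    """Compare the least significant 32 bits of `data` with DVB under DVBM."""
    low32 = data & 0xFFFFFFFF
    return (low32 & dvbm) == (dvb & dvbm & 0xFFFFFFFF)


if __name__ == "__main__":
    # SD of a 64-bit value: only the lower 32 bits take part in the comparison.
    print(dvb_match(0x1122334455667788, 0x55667788, 0xFFFFFFFF))
    # LB of byte 0x80: the value is sign-extended before the comparison.
    print(dvb_match(sign_extend(0x80, 8), 0xFFFFFF80, 0xFFFFFFFF))
```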


 31                                              0
 |                    DVB                        |

Figure 13-6. Data Value Breakpoint Register


 31                                              0
 |                    DVBM                       |

Figure 13-7. Data Value Breakpoint Mask Register


13.3 Setting Breakpoint

The following sections describe the details of breakpoint control, with some sample code.

As the C790 is a pipelined superscalar processor, several restrictions apply when setting
the breakpoint registers. The main points that require care are:


• Changing the breakpointing configuration very likely requires updating 3 or
  more registers. Because the C790 is a pipelined processor, these updates take
  effect in a pipelined manner, which can create a hazardous window in which an
  exception is generated unintentionally.


• The C790 does NOT wait for data arrival on a load operation. The instruction itself
  may retire from the pipeline before the data is stored into the register, so the
  breakpointing event occurs later than instruction completion. This not only
  makes some data value breakpoints imprecise, but can also temporarily mask a
  breakpointing event, as in the following case: a data load instruction that should
  cause a data value breakpoint exception results in a cache miss, and in the next
  cycle another Level2 exception, such as an SIO interrupt, is detected and the
  processor enters Level2 before the data is acquired. In this scenario, the data
  value breakpoint exception is delayed until the processor returns from Level2 mode.
13.3.1 Sequence of Setting Breakpoint


In order to prevent spurious exceptions while reconfiguring the breakpoint, it is mandatory
to manage the breakpointing enables before and after the change. One easy way is to change
the processor mode to Level2 so that the breakpoint exception is masked unconditionally, but
this has the side effect that the user segment becomes unmapped. Therefore, this section
mainly focuses on changing the configuration without changing the processor mode.

The following summarizes the sequence for changing the breakpointing configuration.

1. Synchronize the pipeline

2. Disable the breakpoint exception that is going to be reconfigured

3. Synchronize the pipeline

4. Set appropriate data in the Breakpoint register pairs

5. Set the appropriate configuration in the Breakpoint Control Register, including enabling
   the breakpoint exception

6. Synchronize the pipeline


There are three synchronization points in the sequence. The first one ensures that
there is no pending breakpoint exception, for consistency in the breakpoint exception
handler. The second one comes right after disabling the breakpoint that is going to be
reconfigured. It separates the change in the control register from the changes to the other
breakpoint registers so that the programmer can safely change the breakpoint. The third
synchronization comes after updating the breakpoint control register. Since the C790 issues
instructions in order, the changes to the breakpoint register pair always precede
the change to the control register. In this sense, no spurious exception occurs even without
this synchronization. However, in order to catch a breakpointing event right after
updating the control register, flushing the pipeline at this point is strongly recommended.

The first synchronization must be either a SYNC.P or a SYNC.L operation,
depending on the breakpoint that is going to be reconfigured. For an instruction
breakpoint, SYNC.P is to be used; otherwise SYNC.L is to be used. For the second and
third synchronizations, SYNC.P is to be used.
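
The six-step sequence and the choice of the first synchronization can be summarized as a small Python sketch. It only produces an ordered list of step descriptions for inspection; the function name `reconfigure_steps` and the step wording are assumptions made for illustration, and the actual register accesses are the MTxxx instructions shown in the sample code of the following sections.

```python
def reconfigure_steps(kind):
    """Return the ordered reconfiguration steps for an 'instruction' or 'data' breakpoint."""
    if kind not in ("instruction", "data"):
        raise ValueError("kind must be 'instruction' or 'data'")
    # The first barrier depends on the breakpoint type; the others are SYNC.P.
    first_sync = "sync.p" if kind == "instruction" else "sync.l"
    return [
        first_sync,
        "disable the breakpoint being reconfigured (mtbpc)",
        "sync.p",
        "write the breakpoint value/mask register pair(s)",
        "write the new configuration, including enables (mtbpc)",
        "sync.p",
    ]


if __name__ == "__main__":
    for number, step in enumerate(reconfigure_steps("data"), start=1):
        print(number, step)
```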

The flow for generating TRIG* and the exception is shown in Figure 13-8, Figure 13-9 and
Figure 13-10. Figure 13-8 describes the flow by which the hardware breakpoint encounters a
breakpointing event. Figure 13-9 and Figure 13-10 describe how the exception is generated
and how the TRIG* signal is asserted.

The following sections show some simple sample code for configuring the breakpoint registers.
Several programming notes/issues are given in the comments.
Figure 13-8. Hardware Breakpoint detection flow (Setting)

Figure 13-9. Hardware Breakpoint detection flow (IAB)

Figure 13-10. Hardware Breakpoint detection flow (DAB/DVB) (1/2)

Figure 13-10. Hardware Breakpoint detection flow (IAB) (2/2)


13.3.2 Instruction Breakpointing


The following code sets an instruction breakpoint from 0x1234_5600 to 0x1234_56ff, and
traps if the processor is in either user mode or supervisor mode.


        #
        # Setting instruction address breakpoint from 0x1234_5600 to 0x1234_56ff
        # in user mode and supervisor mode
        #

        # 1st sync.
        sync.p                  # A barrier to ensure there is no pending
                                # instruction address breakpoint in the pipe.
                                # Pipeline flushing works for this purpose.

        #
        # At first, disable instruction breakpointing to avoid spurious exceptions.
        # The following uses a conservative way so as not to break the
        # configuration for data breakpointing.
        #
        mfbpc   $4              # get the value in BPC
        bgez    $4, 1f          # skip following if ( BPC[31] == 0 )
        nop                     # (bds)
        li      $5, (1 << 31)   # IAE is in the 31st bit of BPC
        xor     $4, $5, $4      # Resetting IAE bit to zero.
        mtbpc   $4              # reload BPC.
        # 2nd sync.
        sync.p                  # barrier to ensure the configuration change
                                # of breakpoint function
1:
        #
        # Reconfigure instruction breakpoint address.
        # Note that the least significant 8 bits can be anything because they are
        # masked by the IABM register anyway.
        #
        li      $4, 0x12345678
        mtiab   $4

        #
        # Setting mask register. Masked if the corresponding bit in the mask
        # register is reset to zero.
        #
        li      $5, 0xffffff00
        mtiabm  $5

        #
        # Reconfigure instruction breakpoint. For better understanding, first
        # reset all the bits for instruction breakpoint, and then set the new
        # configuration.
        #
        mfbpc   $4

        #
        # Reset IUE/ISE/IKE/IXE/ITE/BED/IAB. Resetting IAB is especially important
        # in order to know the cause of the next breakpoint exception correctly.
        # BED is reset because, per Table 13-3, BED = 0 enables the exception.
        #
        li      $5, ~(                                  \
                        ( 1 << 26 ) |   /* IUE */       \
                        ( 1 << 25 ) |   /* ISE */       \
                        ( 1 << 24 ) |   /* IKE */       \
                        ( 1 << 23 ) |   /* IXE */       \
                        ( 1 << 17 ) |   /* ITE */       \
                        ( 1 << 15 ) |   /* BED */       \
                        ( 1 <<  0 )     /* IAB */       \
                )
        and     $4, $4, $5

        #
        # Set new configuration to BPC register.
        # Note that setting BPC after IAB/IABM is important to avoid a spurious
        # exception.
        #
        li      $5, (                                                                   \
                        ( 1 << 31 ) |   /* IAE = 1 to enable Inst. B.P. */              \
                        ( 1 << 26 ) |   /* IUE = 1 to enable Inst. B.P in user mode. */ \
                        ( 1 << 25 )     /* ISE = 1 to enable Inst. B.P in supv. mode. */ \
                )
        or      $4, $4, $5
        mtbpc   $4

        # 3rd sync.
        sync.p                  # Barrier to ensure the configuration change
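
The address and mask constants used in the sample above follow from the range being a naturally aligned power-of-two block. The following Python sketch shows the arithmetic; `range_to_breakpoint` is a hypothetical helper name, and the sketch assumes (as the sample comments state) that a mask bit of 1 selects a bit for comparison.

```python
def range_to_breakpoint(start, end):
    """Return (address, mask) covering the inclusive range start..end.

    The range must be a power of two in length and aligned to that length,
    because a single value/mask pair can only describe such a block.
    """
    size = end - start + 1
    if size <= 0 or size & (size - 1):
        raise ValueError("range length must be a power of two")
    if start % size:
        raise ValueError("range must be aligned to its length")
    mask = ~(size - 1) & 0xFFFFFFFF
    return start & mask, mask


if __name__ == "__main__":
    address, mask = range_to_breakpoint(0x12345600, 0x123456FF)
    print(hex(address), hex(mask))
```

Ranges that are not aligned power-of-two blocks cannot be expressed with one register pair; the helper rejects them rather than silently widening the range.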
13.3.3 Data Address Breakpointing


The following code sets a data address breakpoint from 0x1230_0000 to 0x1233_ffff for
both reading and writing, and traps if the processor is in kernel mode (including
under Level 1).


        #
        # Setting data address breakpoint from 0x1230_0000 to 0x1233_ffff
        # in kernel (normal, L1) mode
        #

        # 1st sync.
        sync.l                  # A barrier to ensure there is no pending
                                # data address breakpoint in the pipe.
                                # Must flush all buffers for load/store for this
                                # purpose by SYNC.L

        #
        # At first, reset data-breakpoint related bits to zeros.
        # Resetting DWB/DRB is important so that the handler can recognize the
        # next breakpoint exception correctly.
        # BED is reset because, per Table 13-3, BED = 0 enables the exception.
        #
        mfbpc   $4              # load current configuration
        li      $5, ~(                                  \
                        ( 1 << 30 ) |   /* DRE */       \
                        ( 1 << 29 ) |   /* DWE */       \
                        ( 1 << 28 ) |   /* DVE */       \
                        ( 1 << 21 ) |   /* DUE */       \
                        ( 1 << 20 ) |   /* DSE */       \
                        ( 1 << 19 ) |   /* DKE */       \
                        ( 1 << 18 ) |   /* DXE */       \
                        ( 1 << 16 ) |   /* DTE */       \
                        ( 1 << 15 ) |   /* BED */       \
                        ( 1 <<  2 ) |   /* DWB */       \
                        ( 1 <<  1 )     /* DRB */       \
                )
        and     $4, $4, $5
        mtbpc   $4              # reload BPC.
        # 2nd sync.
        sync.p                  # barrier to ensure the configuration change
                                # of breakpoint function

        #
        # Reconfigure data breakpoint address.
        # Note that the least significant 18 bits can be anything because they are
        # masked by the DABM register anyway.
        #
        li      $6, 0x12305678
        mtdab   $6

        #
        # Setting mask register. Masked if the corresponding bit in the mask
        # register is reset to zero.
        #
        li      $5, 0xfffc0000
        mtdabm  $5

        #
        # Set new configuration to BPC register.
        # Note that setting BPC after DAB/DABM is important to avoid a spurious
        # exception.
        #
        li      $6, (                                                                   \
                        ( 1 << 30 ) |   /* DRE = 1 to enable Data B.P on read */        \
                        ( 1 << 29 ) |   /* DWE = 1 to enable Data B.P on write */       \
                        ( 1 << 19 ) |   /* DKE = 1 to enable Data B.P in kern. mode. */ \
                        ( 1 << 18 )     /* DXE = 1 to enable Data B.P under L1. */      \
                )
        or      $5, $4, $6      # Note that $4 still holds the value used
                                # on MTBPC.
        mtbpc   $5

        # 3rd sync.
        sync.p                  # Barrier to ensure the configuration change
13.3.4 Breakpointing by Data Address and Value


A Data Address and Value breakpoint is set in the same way as a Data Address breakpoint. The
following example is the same as the previous example except that the trap only
happens if the data contains 0xCAFE in its least significant 16 bits, and only on loading
data.


        #
        # Setting data address/value breakpoint from 0x1230_0000 to 0x1233_ffff
        # with data that contains 0xCAFE in kernel (normal, L1) mode.
        #

        # 1st sync.
        sync.l                  # A barrier to ensure there is no pending
                                # data address breakpoint in the pipe.
                                # Must flush all buffers for load/store for this
                                # purpose by SYNC.L

        #
        # At first, reset data-breakpoint related bits to zeros.
        # Resetting DWB/DRB is important so that the handler can recognize the
        # next breakpoint exception correctly.
        # BED is reset because, per Table 13-3, BED = 0 enables the exception.
        #
        mfbpc   $4              # load current configuration
        li      $5, ~(                                  \
                        ( 1 << 30 ) |   /* DRE */       \
                        ( 1 << 29 ) |   /* DWE */       \
                        ( 1 << 28 ) |   /* DVE */       \
                        ( 1 << 21 ) |   /* DUE */       \
                        ( 1 << 20 ) |   /* DSE */       \
                        ( 1 << 19 ) |   /* DKE */       \
                        ( 1 << 18 ) |   /* DXE */       \
                        ( 1 << 16 ) |   /* DTE */       \
                        ( 1 << 15 ) |   /* BED */       \
                        ( 1 <<  2 ) |   /* DWB */       \
                        ( 1 <<  1 )     /* DRB */       \
                )
        and     $4, $4, $5
        mtbpc   $4              # reload BPC.
        # 2nd sync.
        sync.p                  # barrier to ensure the configuration change
                                # of breakpoint function

        #
        # Reconfigure data breakpoint address.
        # Note that the least significant 18 bits can be anything because they are
        # masked by the DABM register anyway.
        #
        li      $6, 0x1233ffff
        mtdab   $6

        #
        # Setting mask register. Masked if the corresponding bit in the mask
        # register is reset to zero.
        #
        li      $5, 0xfffc0000
        mtdabm  $5

        #
        # Configure data value.
        # Note that the most significant 16 bits can be anything because they are
        # masked by the DVBM register anyway.
        #
        li      $6, 0xbabecafe
        mtdvb   $6

        #
        # Setting mask register. Masked if the corresponding bit in the mask
        # register is reset to zero.
        #
        li      $5, 0x0000ffff
        mtdvbm  $5

        #
        # Set new configuration to BPC register.
        # Note that setting BPC after DAB/DABM and DVB/DVBM is important to avoid
        # a spurious exception.
        #
        li      $6, (                                                                   \
                        ( 1 << 30 ) |   /* DRE = 1 to enable Data B.P on read */        \
                        ( 1 << 28 ) |   /* DVE = 1 to enable Data value B.P */          \
                        ( 1 << 19 ) |   /* DKE = 1 to enable Data B.P in kern. mode. */ \
                        ( 1 << 18 )     /* DXE = 1 to enable Data B.P under L1. */      \
                )
        or      $5, $4, $6      # Note that $4 still holds the value used
                                # on MTBPC.
        mtbpc   $5

        # 3rd sync.
        sync.p                  # Barrier to ensure the configuration change


13.3.5 Data Value Breakpointing


A data value breakpoint can be configured so that it traps on the data value alone, by
setting the DABM register to zero and configuring the data breakpoint in “Data Address
and Value” mode.
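
Why a zero DABM yields a value-only breakpoint can be seen from a small model of the combined condition. The Python sketch below is illustrative only: `data_breakpoint_hit` is a hypothetical name, and the model deliberately ignores the DRE/DWE and processor-mode enables in BPC.

```python
def data_breakpoint_hit(addr, data, dab, dabm, dvb, dvbm, dve):
    """Model of the data breakpoint condition for one load/store access.

    With DVE = 0 only the masked address comparison applies. With DVE = 1
    both the address and the lower 32 bits of the data must match.
    """
    addr_hit = (addr & dabm) == (dab & dabm)
    if not dve:
        return addr_hit
    value_hit = (data & 0xFFFFFFFF & dvbm) == (dvb & dvbm)
    return addr_hit and value_hit


if __name__ == "__main__":
    # DABM = 0 masks every address bit, so any address matches and only
    # the data value decides whether the breakpoint event occurs.
    print(data_breakpoint_hit(0xDEAD0000, 0x1234CAFE,
                              dab=0x12300000, dabm=0x00000000,
                              dvb=0xBABECAFE, dvbm=0x0000FFFF, dve=True))
```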
13.4 Triggering External Probes


There is one dedicated pad that makes breakpoints visible outside the C790. This pad, the
TRIG* signal, is asserted for two cycles whenever a breakpoint event is detected. Trigger
signal generation is enabled by setting the ITE/DTE bit in the BPC register to 1. Note that
assertion of the TRIG* signal is not completely synchronized with the occurrence of the
exception: the TRIG* signal is directly connected to the internal breakpoint detect logic,
while an exception, including the breakpoint exception, always occurs along with the
retirement of an instruction. Therefore, the timing of the assertion of the TRIG* signal and
that of the occurrence of the exception may differ. In particular, if the breakpoint is
detected right before entering Level2 mode, and if the breakpoint exception is taken
imprecisely, the exception may be masked because of the processor's mode change although
the TRIG* signal has already been asserted.


13.5 Important Notice on Using Hardware Breakpoint


One important issue not mentioned so far in this chapter is that breakpoint detection does
not take the ASID into account. This implies not only that software has to take care of it
on context switching in order to apply breakpointing to a specific process, but also that an
imprecise breakpoint exception may be detected after or in the middle of context switching.
In such a condition, it may become difficult to identify which process the breakpoint
exception belongs to. This can be avoided by executing a SYNC.L instruction right before
changing the ASID. (Since all imprecise breakpoint events relate to load/store instructions,
executing SYNC.L works as a barrier.)

Related to this issue, as briefly described in section 13.3, the breakpoint exception
may be delayed because of other Level2 exception handling, although the breakpoint exception
actually comes first from the instruction ordering point of view. In such a condition, because
the C790 generates the breakpoint exception after the processor returns from Level2 (see
Note 1), there is no possibility of missing the breakpoint. However, if the program needs to
ensure the order of occurrence between Level2 exceptions, software has to take care of it
(i.e. every Level2 handler has to check for the occurrence of breakpointing first). Similarly,
if a Level2 exception DOES NOT return to where the exception was detected, software has to
make sure that the breakpoint condition is reset.


Note 1: The C790 tracks the occurrence of a breakpoint exception until the breakpoint exception is taken.
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5-9, 5-10, 5-11, 5-13, 9-1, 9-2, 9-3, 9-4, 9-5, 9-6, 9-10, 9-11, 12-6, A-4, C-25, C-26, C-35 
IGRGOND cise conics eet Noe ng ative eatiige ier et ccecune teh Hehe Hee thee Leaded sha tocsnd tees gh dla eteeeateee slope ea ade apd ties ieatbten eee iene as A-3 
GPGONDO) 2... ta siasttt tye ir eee ace as oa ee aks Ur At eed ees eae? 8-10, 8-11, C-2, C-3, C-4, C-5 
CPR ici Gaines A-3, C-17, C-18, C-19, C-20, C-21, C-22, C-23, C-24, C-25, C-26, C-27, C-28, C-29, C-30, 
C-31, C-32, C-33, C-34, C-35, C-36 
GPUADDRi. A.adncaceciinin dander iin ah damnable ela eles 8-3, 8-7, 8-9 
CRUAS TARY: titra: s.25¢ teste Setestesggetaxte ante: teyhecesseSedeartrasctexte sectee teutytea es oeeeds 8-3, 8-7, 8-8, 8-9, 8-12, 8-13, 8-16, 8-19 
CPUBE treet eaitig hive Latina ee it ele ee 8-3, 8-7, 8-9 
GPUGEK aesshetinctdegeeehtaeadeh ace Sarda chien alltel eel eee ee ate el a eee 8-11 
CPU AT Asst itt s iil itis tele Seer 8 a eo eo Seas ee es ne hk 8-3, 8-7, 8-9, 8-17, 8-20 
GCPUDSTARM ivsxtiast stair itee aa ateeet hanee 8-3, 8-10, 8-12, 8-13, 8-16, 8-17, 8-19, 8-20, 8-26, 8-28 
CPURD wtutaia ian lorie a ee ee ee ed se eae ee 8-3, 8-8, 8-9 
GPUTRANS TY PEvsteaiac aiaiteti ing ates a ee a Sh Stes te eat 8-8 
GPUTSIZE wiscrisat saps ce least tad eAsdioaieaneativa deste: Sas ea dea ead Acces ete 8-3, 8-9, 8-12, 8-13, 8-16, 8-19 
BOW ieee oe toate onze ans tenet rane te tsa! cotennsmac aeevensaeet cal gatavsa toga cts sige ch raah! deat talgateass Sede h-fagtetey sdashanenida, tai32: 8-3, 8-8, 8-9 
CT Gadi etic epee ae a ee ee 3-21, 10-7, 10-8, 10-9, 10-13, 11-9, D-15, D-40 
CTE Soctebestdavtrtvel aesenteeteinetth Beengvnisiw aay idiebrle ed ieviveeavttetaln aden 4-28, 4-29, 5-11, 9-2, 9-4, 9-5, 9-10, 9-11 
On 0 Bene teree neeerecee creer mer sence ec or perr cert eeerte etre cnet error ee epee ert crec peeerrse eer reer eer ree rere 4-29, 9-10, 9-11 
LO mu eaeeraeri et eecrceeree rene Ceere Cree eer CREE er eee CERES Cer ce eeee nore cere rae ere enercecn reece 4-29, 9-10, 9-11 
CU 2.8 atheist aed ed ad eal 1-5, 3-5, 3-20, 3-21, 4-16, 4-17, C-1, C-14, C-15 
CUO mwah cities duet eth elect oe i tei te a tee he een cal 5-23, C-7 
LA al Di eee eee or re Perera ee reer reer or ry Teer erere berry etre ir eeree peer reer Meter cye rere verre peer Terre rere ee eer eee ert a ercretr repre 3-26 
O14 1B eerer peer ierncepe: terme inc rorpecrrreerrrerceree rere cerrrerietrccerer crite her coccerceremecete eocceper eeeeeree ceereeecre cere ate reece rece D-16 
CVT DitmMtssackdis histatins oe eee edt te ee 3-21, 10-14, D-41 
GV Ty etches iegeteeieed vei lobehad pe cotedeleeed ele aecehaeeettene eel senda tee ee oedema tees tenets D-17 
GV Di Petititisnstvicrt stati ees tie eae Nips eee Shot ee ee a aie 3-21, 10-14, D-41 
A IPS epee ene eee eer eee RE EERE RE opie PER RED EEE PEE Ere Pee rceee cree cree ere creer recta Corer ere D-18 
CNT SIM ties iia ieee is nae ee Lae ed ee 3-21, 10-14, D-41 
CVT WAM ta tietieten aia aii attat in a es lees Gn as ated oat ae, tae ayn 3-21, 10-14, D-41 
GN TW Soh arse eet a eet va teet ac deena a Db ested asia hapeahtpests sascha dae tes ade oachdou tee utests aah da, Same ivans ade Sareae oR Reaesee D-19 
D 
DA cere a a ea 2 tt sega ag ee eet cage a wee SA nes Meare cee geste 4-27, 13-3, 13-7, 13-12, 13-16, 13-19 
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DABM tsetiusd yorici yea ven nathevb aha aii aie ageb ae ay une nate gee ee 4-27, 13-3, 13-7, 13-16, 13-18, 13-19 
DAD Dice cites shetty Ata ee his ee te tn tet lait oh 3-15, 5-26, A-34, A-141 
YB) D) Beker renee cerer terre arte ieee ior se creer ee rereeerrece terrence ce reree 3-14, 5-26, A-35, A-141, B-163, C-41, D-40 
DADDIU! ccicnis iin Laie ea ae Aa ean ees 3-14, A-35, A-36, A-141, B-163, C-41, D-40 
DAD DU tac stentitenth iin ei ee A AIA ens ats tooth Ante 3-15, A-34, A-37, A-141 
DBE Sie cvetsck cb olicdectsensteaddetiescia st haadantpetaceaetes cada, tAaatentectaansPradeuutis lade sebcgeaneeste tise vaaseatnceeersastaeaiee 4-20, 5-8, 5-19 
DORR a rrre pee reearecte ce rare cet Ea eres cece RCE enc iererteeeeerr eee errereerie renee reece eeeereeeeeerce renee 4-23 
DGE sieht eienianidnel Rania iehianitihigeilie ae 4-23, 5-11, 9-7, C-9, C-28 
DDN itsteciitendelcebivtuete bres Seer hiveaveds vote hte aaeatt ivedtivbelea i eeatibeadted 3-4, 3-14, A-142, B-165, C-42, D-41 
PDI Wie ceorecteasaadeg severes nee die pte eevee aided tay eve ee, panes eeiatenear at lenegenieeeie 3-4, 3-14, A-142, B-165, C-42, D-41 
fo [slo Ul eepeeepeee eeereeeceeerePree errieneerreeretre i merrerreeerrer i eeereeeer corre 3-20, 4-17, 4-18, 4-19, 4-26, 4-33, 5-10, 5-14, 13-6 
DEBUG ie wine iif alain eid id ee ee ee el eS 5-14 
DEG foster ets attach i At Gate a Ao ts an MA i titan fe a a eee A ee ate os bc ad 3-6 
COCOUPIING idk saree an ese eT A a a een en en ees 2-4 
/BT=Yh a) UI i) olf =>:<clo bpernecrterereree raercrerperrtrerceey trreerererrreepererrcecre rere rercererccerr rep creenceccereerpeetrtcrceerececcerrPerc rer rrceect 2-18, 8-2 
DEV .......... 4-16, 4-17, 5-7, 5-13, 5-14, 5-25, 9-10, 12-6
DHIN .......... C-6
DHWBIN .......... C-6
DHWOIN .......... C-6
DI .......... 3-20, 4-16, 4-17, 5-23, C-1, C-14, C-15, C-42
DIE .......... 4-23, 4-24, 5-11
CUA. : iach fae dace cane tvs deeehes bese oe nreateciny seeds a Aned anit aneatevendeeirsaa da eeaee 4-8, 5-18, 6-16, 8-12, A-91, C-11, C-12 
UPL 3 2o5. saise deat ped gecets Sedeax staset ras biteek cadunde ks tonest cca eitedt aslagtansceeecst aovedeeaatgatis 4-8, 4-32, 5-11, 6-16, C-11, C-12, C-13 
dispatches .......... 3-17
displacement .......... 3-3, A-9
DIV .......... 2-18, 3-16, 3-26, A-38, A-40, A-80, A-141, D-20
DB) Ada) eoneerrecee cnet cence error areerrcr rece enarre er enero na eeeereEreeecer eee ee eecerer etree rere rereerer 3-21, 10-14, D-41 
DIV1 .......... 2-14, 3-23, 3-26, 4-2, B-3, B-7, B-9, B-163
Divide .......... 1-1, 2-6, 3-14, 3-16, 3-21, 3-22, 3-23, 3-24, 3-26, 4-1, B-3, B-5, B-8
DIVU .......... 3-16, 3-26, A-40, A-141
DIVU1 .......... 2-14, 3-23, 3-26, 4-2, B-3, B-9, B-163
DKE .......... 13-6, 13-16, 13-18, 13-19
DMA .......... 8-1, 8-3, 8-6, 8-7, 8-10, 8-12, 8-13, 8-14, 8-25, 8-26
DMAC .......... 8-1, 8-3, 8-10, 8-11, 8-13, 8-14, 8-25, 8-26
DMFC1 .......... 3-21, 10-13, D-21, D-40
DMTC1 .......... 3-21, 10-13, D-22, D-40
DMULT .......... 3-4, 3-14, A-142, B-165, C-42, D-41
DMULTU .......... 3-4, 3-14, A-142, B-165, C-42, D-41
doubleword .......... 3-5, 3-8, 3-9, 5-15, A-4, A-5, A-6, A-34, A-37, A-41, A-42, A-43, A-44, A-45, A-46, A-47,
A-48, A-49, A-50, A-51, A-58, A-59, A-60, A-63, A-64, A-72, A-94, A-95, A-96, A-99, A-100,
A-118, A-122, B-2, B-64, B-65, B-72, B-74, B-78, B-79, B-80, B-81, B-82, B-83, B-89, B-93,
B-95, B-113, B-120, B-122, B-128, B-129, B-130


DRB .......... 13-6, 13-16, 13-18
DRE .......... 5-11, 13-5, 13-6, 13-8, 13-16, 13-18, 13-19
DSE .......... 13-6, 13-16, 13-18
DSLL .......... 3-15, A-41, A-141
DSLL32 .......... 3-15, A-42, A-141
DSLLV .......... 3-15, A-43, A-141
DSRA .......... 3-15, A-44, A-141
DSRA32 .......... 3-15, A-45, A-141
DSRAV .......... 3-15, A-46, A-141
DSRL .......... 3-15, A-47, A-141
DSRL32 .......... 3-15, A-48, A-141
DSRLV .......... 3-15, A-49, A-141
DSUB .......... 3-15, 5-26, A-50, A-141
DSUBU .......... 3-15, A-50, A-51, A-141
DTE .......... 13-6, 13-16, 13-18, 13-20
DTLB .......... 2-3, 2-6, 2-16, 4-29, 9-6, 9-8
DUE .......... 13-6, 13-16, 13-18
DVB .......... 4-27, 13-3, 13-8, 13-12
DVBM .......... 4-27, 13-3, 13-8, 13-18
DVE .......... 13-5, 13-16, 13-18, 13-19
DWB .......... 13-6, 13-16, 13-18
DWE .......... 5-11, 13-5, 13-6, 13-8, 13-16, 13-18
DXE .......... 13-6, 13-16, 13-18, 13-19
DXIN .......... C-6
DXLDT .......... C-6
DXLTG .......... C-6
DXSDT .......... C-6
DXSTG .......... C-6
DXWBIN .......... C-6
E 

EC .......... 4-23
EDI .......... 4-16, 4-17, 5-23, C-1, C-14, C-15
ke [Fe Cee r cere eer eer eee CEPECE PRR TEE Cr CEC EECEECE Dero CEr RCE Tene EEPECRE TEEPE Ce Eee Cer eRe ee merer eer rerrem ene etree 4-23 
EI .......... 3-20, 4-16, 4-17, 5-23, C-1, C-14, C-15, C-42
EIE .......... 4-16, 4-17, 4-18, 5-24, C-14, C-15
endian .......... 3-5, 3-6, 3-7, 3-9, 3-10, 3-11, 3-12, 3-13, A-3, A-6, A-61, A-62, A-65, A-66, A-73, A-74,
A-77, A-78, A-97, A-98, A-101, A-102, A-119, A-120, A-123, A-124

endianness .......... 3-9
Endianness .......... 1-2, 3-5
EntryHi .......... 2-15, 4-5, 4-14, 5-15, 5-16, 5-17, 5-18, 6-2, 6-3, 6-4, 6-15, C-28, C-37, C-38, C-39, C-40
EEMUY iat Aiea ah ee Suet eau Ati ooh a eh ea has eee acct Seah ia heal ca oh sbabeiMuaots sbeaths uesuuced de tek ah 6-16 
Entry? ative oie Lie a Le, A ee ei ee ee C-37 
EntryLo .......... 5-15, 5-16, 5-17, 5-18, 6-15, C-38, C-39, C-40
EntryLo0 .......... 2-15, 4-5, 4-8, 5-16, 6-15, 6-16, C-38, C-39, C-40
EntryLo1 .......... 2-15, 4-5, 4-8, 5-16, 6-15, 6-16, C-38, C-39, C-40
EPC .......... 2-6, 2-15, 4-5, 4-21, 4-33, 5-2, 5-3, 5-15, 5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23,
5-26, 5-27, 11-9, C-16
ERET .......... 2-11, 2-12, 2-13, 3-20, 4-4, 5-5, 5-24, 6-11, 9-7, 9-11, 12-2, 12-5, C-16, C-38, C-39, C-40, C-42
ERL .......... 4-16, 4-17, 4-18, 5-5, 5-9, 5-11, 5-12, 5-13, 5-14, 5-19, 5-24, 5-25, 6-6, 6-7, 6-8, 6-9, 6-10,
6-11, 6-12, 9-2, 9-10, 9-11, 13-5, 13-6, C-14, C-15, C-16
EREO esieaiuntie en tehilen UA Aa Ge ah eaten eae kee eatin ty ete ok 9-5 
ERA ss adidiadar nies faise delta in ti ead es aidan in aarp ca 9-5 
Error .......... 2-6, 2-15, 4-5, 4-12, 4-17, 4-18, 5-2, 5-10, 5-15, 5-19, 5-23, 6-6, 6-7, 6-9, 8-13, 8-25, 8-26,
8-28, A-2, A-54, A-55, A-56, A-57, A-58, A-62, A-66, A-67, A-68, A-70, A-74, A-78, A-79,
A-93, A-94, A-98, A-102, A-103, A-116, A-120, A-124, B-10, B-162, C-7, C-8, D-26, D-34,
D-37
ErrorEPC .......... 4-33, 5-5, 5-12, 5-13, 5-14, 5-25, 9-10, 9-11, C-16
ErrorPC .......... 2-15, 4-5
EVENT .......... 9-5
EVENT0 .......... 4-28, 4-29, 9-2, 9-5, 9-6, 9-11
EVENT1 .......... 4-28, 4-29, 9-5, 9-6, 9-11
EXC2 .......... 4-19, 5-5, 5-8, 5-11, 5-12, 5-13, 5-14, 5-25, 9-10
ExcCode .......... 4-19, 4-20, 5-2, 5-8, 5-15, 5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23, 5-24, 5-26, 5-27
exception .......... 2-15, 2-16, 2-18, 2-19, 3-2, 3-5, 3-16, 3-18, 3-20, 4-4, 4-5, 4-9, 4-12, 4-14, 4-16, 4-17, 4-18,
4-19, 4-20, 4-21, 4-29, 4-33, 5-1, 5-2, 5-3, 5-5, 5-8, 5-9, 5-10, 5-11, 5-12, 5-13, 5-14, 5-15,
5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23, 5-24, 5-25, 5-26, 5-27, 6-1, 6-2, 6-4, 6-6,
6-9, 6-11, 6-14, 6-15, 6-16, 6-17, 6-20, 8-13, 8-25, 9-2, 9-7, 9-8, 9-10, 9-11, 10-8, 11-2, 11-3,
12-1, 12-2, 12-3, 12-5, 12-6, 12-7, 12-14, 12-15, 12-16, 12-17, 12-18, 12-19, 12-20, 13-2,
13-4, 13-5, 13-6, 13-8, 13-9, 13-14, 13-15, 13-16, 13-18, 13-19, 13-20, A-2, A-6, A-8, A-11,
A-12, A-13, A-14, A-20, A-21, A-28, A-29, A-33, A-34, A-35, A-36, A-37, A-38, A-39, A-40,
A-50, A-51, A-54, A-55, A-58, A-67, A-68, A-70, A-86, A-87, A-91, A-92, A-94, A-103, A-106,
A-107, A-108, A-109, A-114, A-115, A-116, A-126, A-127, A-128, A-129, A-130, A-131,
A-132, A-133, A-134, A-135, A-136, A-137, A-138, A-142, B-7, B-8, B-9, B-11, B-12, B-13,
B-14, B-20, B-21, B-22, B-23, B-25, B-27, B-28, B-66, B-67, B-68, B-70, B-71, B-84, B-86,
B-91, B-93, B-95, B-111, B-113, B-118, B-120, B-122, B-165, C-1, C-2, C-3, C-4, C-5, C-7,
C-8, C-16, C-17, C-18, C-19, C-20, C-21, C-22, C-23, C-24, C-25, C-26, C-27, C-28, C-29,
C-30, C-31, C-32, C-33, C-34, C-35, C-36, C-37, C-38, C-39, C-40, C-42, D-26, D-37, D-41
Exception .......... 2-6, 2-11, 2-15, 2-19, 3-18, 3-20, 3-21, 4-5, 4-18, 4-20, 4-21, 5-1, 5-2, 5-3, 5-4, 5-5, 5-6, 5-7,
5-8, 5-9, 5-10, 5-11, 5-12, 5-13, 5-14, 5-15, 5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23,
5-24, 5-25, 5-26, 5-27, 5-28, 6-6, 6-11, 8-25, 8-26, 12-2, 12-5, 12-6, 12-7, 12-14, 12-15,
12-16, 12-17, 12-18, 13-2, 13-6, A-8, A-37, A-79, B-62, C-8
Exceptions .......... 11-5
execution pipeline .......... 2-3, 2-5, 2-10, 2-11, 2-12, 3-26, C-16
=) a 06 lene eeeeer er eer Rr reee err eer ferry reer er rer Certs tren rer ee terre tere ree teres reer prt ree errr 12-14, 12-15, 12-16, 12-17, 12-18 
(alin Pprrrecreeeereeer er rectier eaee cette ocerrcreen rere rcrnc ee ereciecrceecrne re recererereceererecr er rceec occa terra creer ee 12-19, 12-20 
EXAnd2 sities ahaa heen eaetiatied hee eel ah epee ei eae 12-19, 12-20 
EXL .......... 4-16, 4-17, 4-18, 4-21, 4-29, 5-2, 5-5, 5-7, 5-9, 5-12, 5-16, 5-19, 5-24, 6-6, 6-8, 6-9, 6-10,
6-11, 6-12, 9-2, 12-6, 13-5, 13-6, C-14, C-15, C-16
EXL0 .......... 4-29, 9-2, 9-5, 9-11
EXL1 .......... 4-29, 9-5, 9-11
F 
FCR .......... D-14
FCR0 .......... 10-4
FCR31 .......... 10-4, 10-6, D-15
FCRs .......... 10-4
FGtCHAGGreSSe arctic hedsidt voonac dentin tev tueva te hataat ace boatha senvona dh nda dtetuscatade dhaahtad daduadatutuonacn Meaiea tenn dt daatea deers C-10, C-11 
TC) tert eee e Ree eee re reece eee ore Ecre rere erie Cee eC e CEE CEE Eee eee Eerie rere erence eres 10-13 
FGRs .......... 10-2
FLOOR.L .......... D-23
FLOOR.L.fmt .......... 3-21, 10-14, D-41
FLOOR.W .......... D-24
FLOOR.W.fmt .......... 3-21, 10-14, D-41
EP CONttOliisccctisevcshcssastcteetadrenestaakebivevaviveuatarecstawbecverstendbeustawavauatavenata wavassiagabtieametnaisa ci wecuatbasvaweranetaes’ D-14, D-15 
FPE .......... 4-20, 5-8, 5-28, 11-3
FPR .......... 2-3, 2-9, D-2, D-4, D-5, D-8, D-12, D-13, D-16, D-17, D-18, D-19, D-20, D-21, D-22, D-23,
D-24, D-26, D-27, D-28, D-30, D-31, D-32, D-33, D-35, D-36, D-37, D-38, D-39
FPRs .......... 10-2, D-10, D-16, D-17, D-28
FPU .......... 1-2, 2-3, 2-7, 2-8, 2-14, 2-18, 4-16, 10-13, 10-14, 11-2, 11-5, 11-8, D-1, D-2, D-3, D-14,
D-15, D-27, D-29
FR .......... 4-16, 4-17, 10-2
funnel shift .......... 2-3, 2-14, 4-1, 4-2, 4-4, B-17, B-20, B-21, B-22, B-161
Funnel shift .......... 2-11
G 
rofl lalcyilaie ererercerecrectrereerenar a rerneeroreer a nercceeceereer terra teeter tere eccer creer er 2-4, 2-19, 6-17, 9-1, A-8, A-125 
General Purpose Registers .......... 2-3, 4-1, 4-2, 4-3, 4-4, A-3
Global Ditis 24 sais. seey Reese te Bah eden stad ade y save et Ac auears ctiapeviag fauhe tea daeSehd cnet cauaba tend Maur eietenuede ceurdeated tetas 6-18 
Cll eorcepererre PERC rccec ercrcoerer cepa CeirPeerr eree trckrr cecere bert ccerecr rere Peet creeper eee eereeec peetcererecerecener cece erorereererrrce D-21 
GPRIOs fcitinente ieacitieed ead eet vanities hehe eon B-21, B-22
GPRLEN .......... A-3, D-6, D-7
H 
HI .......... 2-11, 2-14, 3-16, 3-22, 3-23, 3-24, 3-26, 4-1, 4-2, 4-3, 4-4, A-38, A-39, A-40, A-80, A-84,
A-86, A-87, B-2, B-5, B-11, B-13, B-23, B-25, B-66, B-67, B-68, B-70, B-84, B-85, B-86,
B-87, B-91, B-92, B-93, B-95, B-101, B-102, B-111, B-113, B-115, B-116, B-118, B-120,
B-122
HI0 .......... 4-2, 4-3, 4-4, B-2
HI1 .......... 2-11, 2-14, 4-2, 4-3, 4-4, B-2, B-3, B-7, B-8, B-9, B-12, B-14, B-15, B-18, B-24, B-26
hit under miss .......... 1-2, 4-23
I

IAB .......... 4-27, 13-3, 13-6, 13-7, 13-11, 13-13, 13-14
IABM .......... 4-27, 13-3, 13-7, 13-14
IAE .......... 5-11, 13-5, 13-14, 13-15
IBE .......... 4-20, 5-8, 5-19
IC .......... 4-23
ICE .......... 4-23, 5-11, C-9
WDD pease veel ceteuebecPhebepecit ies beleeed ah cava Chane veteels ves lobar te Gn cdautebed et aaed vey daleasheediewebeety Pe etalee ne valoeltare hs 4-14, 6-16 
IE .......... 4-16, 4-17, 4-18, 5-9, 5-12, 5-24, C-14, C-15
IEEE .......... 2-18, 10-1, 10-8, 10-9, 10-10, 11-2, 11-3, 11-6, 11-7, 11-8, 11-9, D-8, D-12, D-13, D-19
IFL .......... C-6
IHIN .......... C-6
IKE .......... 13-5, 13-14
IM .......... 4-13, 4-16, 4-17, 4-18, 5-9
imprecise .......... 5-14, 5-19, 8-13, 13-2, 13-5, 13-8, 13-20
Index .......... 2-15, 3-20, 4-5, 4-6, 5-18, 5-19, 6-20, C-7, C-9, C-10, C-11, C-12, C-13, C-37, C-38, C-39
INDEX .......... C-6
[ate [=> Compepceree pres reer ree eerie cere rer Terran rete ere cern retrrrenecererr eres ceeereereerees rerrtren eer rene reeeree ree eres C-38, C-39 
[Nitsa seve ail eee esl edi aed eandeeceiatian in leeinae  eeieed 9-11 
WAITIANIZ Sinem ots oti ache Cireeee eet Lo eked Gin eaten ol reeked rere aan Ln eee eke bevee taeeeih ofl eee hae 9-11 
IYMUEU EAL ZAPU Gao seis ee Pas esses ea sa se Sa Sa eeee dat ceca pa edna eca Sh by wea cpa danced nua gota tee advan wegdbicedeaededenius gen aaiee Sarena 5-11 
alteliPAlaleleereeerereecereercreeecre rere rcereeer re cece rcrreeerrecrcrecerererence cee rcreceeercrtcerrcrreerree erecrec rec rcererecrrereee rceperr error 9-11 
INFirseetgeties aia aida ta Aiki ie ait aaa a Me dete ani eee 8-10 
interleave .......... B-88, B-89
interleaved .......... B-88, B-89
interrupt .......... 1-5, 3-16, 3-22, 4-13, 4-15, 4-16, 4-17, 4-19, 4-33, 5-24, 8-10, 8-13, 8-25, 8-26, 9-4, 13-8, C-16
Interrupt .......... 3-20, 4-16, 4-17, 4-18, 4-19, 4-20, 5-2, 5-5, 5-7, 5-8, 5-9, 5-10, 5-12, 5-24, 8-10, 8-25, 12-6
Interrupts .......... 4-16, 4-18
INVALIDATE .......... C-6
ISE .......... 13-5, 13-14
ESCST UT = SR ea 2-3, 2-12 




issues .......... 2-3, 4-24, 8-12, 13-9
ITE .......... 13-6, 13-14, 13-20
ITLB .......... 2-3, 2-6, 2-16, 9-6, 9-8
IUE .......... 13-5, 13-14, 13-15
IV .......... 1-1, 1-2, 1-3, 2-16, 3-2, 3-4, 3-19, 6-1, A-82, A-83, A-91, A-141
IXE .......... 13-5, 13-14
IXIN .......... C-6
IXLDT .......... C-6
IXLTG .......... C-6
IXSDT .......... C-6
IXSTG .......... C-6
J 
J .......... 3-3, 3-17, 9-7, 12-2, A-9, A-17, A-18, A-19, A-22, A-23, A-24, A-25, A-26, A-27, A-30, A-31,
A-32, A-52, A-61, A-62, A-65, A-66, A-73, A-74, A-77, A-78, A-141, B-163, C-41, D-6, D-7,
D-40
JAL .......... 3-17, 9-7, 12-2, A-20, A-21, A-28, A-29, A-53, A-141, B-163, C-41, D-40
JALR .......... 3-17, 9-7, 12-2, 12-5, A-20, A-21, A-28, A-29, A-54, A-141
IMIDA 22, teathe teeth eo ioe Sha tothe see Sa eset Meine toe a es eae Aes hot abd eee ei Aa 12-3, 12-4 
8 La = Sooper erence ae ere er epee eeree aeRO ee Cee ERO ere Cer ee nae eee cnee ce cee cere rreerericece reer erererree eee racers 12-3, 12-4 
JR .......... 3-17, 9-7, 12-2, 12-5, A-17, A-18, A-19, A-22, A-23, A-24, A-25, A-26, A-27, A-30, A-31,
A-32, A-55, A-141, D-6, D-7
ITED vase oan caste eia cat ecvadeceay spc aas sev iasepsia edunt a baduct Aceaausi van sotavic Teahnes onda, b Pen cadirh ad dates avadeadad sone a ceaesieah es 9-6, 9-8 
K 
K0 .......... 4-23, 4-24, 4-29, 6-7, 6-12, 9-2, 9-5, 9-10, 9-11, C-28
KB .......... 6-2, 6-5, A-17, A-18, A-19, A-20, A-21, A-22, A-23, A-24, A-25, A-26, A-27, A-28, A-29,
A-30, A-31, A-32
Kernel .......... 2-16, 2-19, 3-20, 3-26, 4-16, 4-17, 4-18, 4-29, 5-2, 5-22, 5-23, 6-1, 6-6, 6-7, 6-10, 6-11,
6-12, 6-13, 9-2, 13-5, 13-6, C-1, C-7, C-14, C-15
kseg0 .......... 4-24, 6-7, 6-12, 9-10, C-28
kseg1 .......... 6-7, 6-12
kseg3 .......... 2-16, 4-9, 6-1, 6-7, 6-12, 6-13
ksseg .......... 6-7, 6-12
KSU .......... 4-16, 4-17, 4-18, 5-2, 6-6, 6-8, 6-9, 6-10, 6-11, 6-12, 6-13, C-14, C-15
kuseg .......... 2-16, 6-1, 6-7, 6-12
L 
LB .......... 3-4, 13-8, A-56, A-141, B-163, C-41, D-40
LBU .......... 3-4, A-57, A-141, B-163, C-41, D-40
LD .......... 3-4, 13-8, A-5, A-58, A-141, B-163, C-41, D-40
LDC1 .......... 3-5, 3-21, 3-26, 10-13, A-141, B-163, C-41, D-25, D-40
LDL .......... 3-4, 3-8, A-59, A-60, A-63, A-141, B-163, C-41, D-40
LDR .......... 3-4, 3-8, A-59, A-63, A-64, A-141, B-163, C-41, D-40
LH .......... 3-4, 13-8, A-67, A-141, B-102, B-163, C-41, D-40
LHU .......... 3-4, A-68, A-141, B-163, C-41, D-40
Nesey ie he eae ganda ea enrea A Ma deel geste doa dnoaa iach auld ee hielo eas 13-14, 13-15, 13-16, 13-18, 13-19 
LWA r. Sevsactteeevialscuveaeebtetsch cade Hud Tee Sean duebescitetstansttnsteanbecdhiechtulysuau Austuedaevaunhdsyetl chastise sity sgentcats 2-11, 3-17, 3-18, 4-4 
LL .......... 1-2, 3-4, A-142, B-165, C-42, D-41
LLD .......... 1-2, 3-4, A-142, B-165, C-42, D-41
LO .......... 2-11, 2-14, 3-16, 3-22, 3-23, 3-24, 3-26, 4-1, 4-2, 4-3, 4-4, A-38, A-39, A-40, A-81, A-85,
A-86, A-87, B-2, B-5, B-11, B-13, B-23, B-25, B-66, B-67, B-68, B-70, B-84, B-85, B-86,
B-87, B-91, B-92, B-93, B-95, B-102, B-106, B-111, B-113, B-116, B-117, B-118, B-120,
B-122
LO0 .......... 4-2, 4-3, 4-4, 6-16, B-2
LO1 .......... 2-11, 2-14, 4-2, 4-3, 4-4, 6-16, B-2, B-3, B-7, B-8, B-9, B-12, B-14, B-16, B-19, B-24, B-26
LoadMemory .......... A-6, A-56, A-57, A-58, A-60, A-64, A-67, A-68, A-70, A-72, A-76, A-79, B-10
Eo, erceenereerereeceecercr certee ore rcrecLreerrece rrrecr neerrertcrec reece eer tcepcr erence erect Pere 2-17, 4-32, 5-11, C-11, C-12, C-13 
locking .......... 2-17
logical pipe .......... 2-10, 2-12, 2-13
LQ .......... 3-5, 3-25, 13-8, A-141, B-4, B-10, B-163, C-41, D-40
LRF .......... 4-32, 5-11, C-9, C-10, C-11, C-12, C-13
LUI .......... 3-14, 3-26, A-69, A-141, B-163, C-41, D-40
LW .......... 3-4, A-5, A-70, A-141, B-102, B-116, B-163, C-41, D-40
LWC1 .......... 3-5, 3-21, 3-26, 10-13, A-141, B-163, C-41, D-26, D-40
DWV. Ges sade? he cat uceets bogs Lage Eras ede ast saueaty neds dt _engecheos wederaats wages annecd oewi decd rad baeawricaraeesiauees A-142, B-165, C-42, D-41 
LWL .......... 3-4, 3-8, A-71, A-72, A-75, A-76, A-141, B-163, C-41, D-40
LWR .......... 3-4, 3-8, A-71, A-72, A-75, A-76, A-141, B-163, C-41, D-40
LWU .......... 3-4, A-79, A-141, B-163, C-41, D-40
A Oe er rere TERRE ee eT ERETEEL CET Meee Lear ere ECE Toe ETO REPEC nore ence erence 2-13, B-4, B-90 
M 

MAC .......... 2-11, 3-16, 3-22
MAC0 .......... 2-11, 2-12, 2-13
MAC1 .......... 2-11, 2-12, 2-13
MADD .......... 3-23, 3-26, B-3, B-11, B-13, B-163
MADD1 .......... 2-14, 3-23, 3-26, 4-2, B-3, B-12, B-14, B-163
MADDU .......... 3-23, 3-26, B-3, B-13, B-163
MADDU1 .......... 2-14, 3-23, 3-26, 4-2, B-3, B-14, B-163
Mask .......... 2-15, 2-19, 3-20, 4-5, 4-10, 4-16, 4-17, 4-27, 5-9, 5-24, 6-15, 13-3, 13-4, 13-7, 13-8, C-20,
C-22, C-24, C-30, C-32, C-34, C-39, C-40
MASK .......... 4-10, 6-16
maskable .......... 5-8, 5-12
MAX .......... 2-18
MB .......... 6-2, 6-5, 6-12, 6-13, A-52, A-53
MF0 .......... C-41
MFBPC .......... 3-20, 13-4, C-17, C-41
MFC0 .......... 3-20, 4-1, 9-3, 13-2, 13-4, C-18
MFC1 .......... 3-21, 10-13, D-27, D-40
MFDAB .......... 3-20, 13-4, C-19, C-41
MFDABM .......... 3-20, 13-4, C-20, C-41
MFDVB .......... 3-20, 13-4, C-21, C-41
MFDVBM .......... 3-20, 13-4, C-22, C-41
MFHI .......... 2-11, 3-16, A-80, A-81, A-141
MFHI1 .......... 2-11, 2-14, 3-23, 4-2, B-3, B-15, B-163
MFIAB .......... 3-20, 13-4, C-23, C-41
MFIABM .......... 3-20, 13-4, C-24, C-41
MFLO .......... 3-16, 3-23, A-81, A-141
MFLO1 .......... 2-14, 3-23, 4-2, B-3, B-16, B-163
MFPC .......... 3-20, 9-2, 9-3, C-25, C-41
MFPS .......... 3-20, 9-2, 9-3, C-26, C-41
MFSA .......... 3-25, A-141, B-5, B-17, B-20, B-21, B-22
MIN .......... 2-18
misaligned .......... 3-8
misalignment .......... C-8
IMIS PEGGLE: Sees recec creda deste veacaee hagausnevens eta van dads oachadecsovahcatecvag cesbeshadesuelcaeceissstaatest ad teebeslagh ennadesa@ieasaaseendes 9-6, 9-7 
Yo eeeeeeeneereerrececceccrr err eeecercrre er erpcreceree terrence cere srreece rer eet rete ererece rears 2-17, 4-17, 6-4, 8-8, 9-7, 9-8, 12-6 
misses .......... 1-1, 6-17, 9-9
MMI .......... 5-22, A-141, B-163, B-164, B-165, C-41, D-40
MMI0 .......... B-163, B-164
MMI1 .......... B-163, B-164
MMI2 .......... B-163, B-165
MMI3 .......... B-163, B-165
MMU .......... 2-3, 2-15, 2-16, 4-5, 6-1, 6-14
mod.......... A-38, A-40, B-7, B-9, B-66, B-68, B-70
MOV.......... 11-6, D-28
MON STIMU ecirectacctscbereteee Raed ert oh caren ethegee veel Wh careectbenepetteaeh betel oh eave bweeetata sees eeh tke ald earetes tee eeutet eg 10-8 
MOV.fmt.......... 3-21, 10-14, D-41
WIOV Oils cces AY og cates etek sleet tacecahh a sttecnen et heehee eee h feted es es Meng ec ee SE AG ca kd sora ah ea ste ee ait sa cece ica ath tied 2-11 
MOVN.......... 3-19, A-82, A-141
MOVZ.......... 3-19, A-83, A-141
MT0.......... C-41
MTBPC.......... 3-20, 13-4, 13-16, 13-19, C-27, C-41
MTC0.......... 3-20, 4-1, 9-3, 13-2, 13-4, C-28
MTC1.......... 3-21, 3-26, 10-13, D-29, D-40
MTDAB.......... 3-20, 13-4, C-29, C-41
MTDABM.......... 3-20, 13-4, C-30, C-41
MTDVB.......... 3-20, 13-4, C-31, C-41
MTDVBM.......... 3-20, 13-4, C-32, C-41
MTHI.......... 2-11, 3-16, A-84, A-141
MTHI1.......... 2-11, 2-14, 3-23, 4-2, B-3, B-18, B-163
MTIAB.......... 3-20, 13-4, C-33, C-41
MTIABM.......... 3-20, 13-4, C-34, C-41
MTLO.......... 3-16, A-85, A-141
MTLO1.......... 2-14, 3-23, 4-2, B-3, B-19, B-163
MTPC.......... 3-20, 9-2, 9-3, C-35, C-41
MTPS.......... 3-20, 9-2, 9-3, C-36, C-41
MTSA.......... 2-13, 3-25, A-141, B-5, B-17, B-20
MTSAB.......... 2-13, 3-25, A-141, A-142, B-5, B-20, B-21, B-22, B-161
MTSAH.......... 2-13, 3-25, A-141, A-142, B-5, B-20, B-22, B-161
MITSAXGvedlosteentntitevesth ee hehle deb nearest celts cncaee eee e state vel bed veblchy eiencabbenecbtenes wade tate eulopedarleesteeeuuaiha, B-20 
MUL.......... 2-18, D-30
VLE 0 a) eee ers neaee eeree ee ree reece en errretrecr ere er ene tree cence prince creer rere eearr reece tcene eeee cererreeerese area 3-21, 10-14 
MUL.fmt.......... D-41
MULT.......... 3-16, 3-23, 3-26, A-80, A-86, A-87, A-141, B-3, B-23, B-25
MULT1.......... 2-14, 3-23, 3-26, 4-2, B-3, B-24, B-26, B-163
NU eeeereeeeecreerceecrerenrice teecrer n erper ceccre re reenter eeeereereeeet rere ene core renee Reet eee ere rere 1-2 
MUultiMasterssacs net each Mattia id RE ith ea tied htietate Masel ealied Meee wade ane 2-18, 8-2 
multimedia.......... 1-1, 1-2, 2-3, 2-6, 3-2, 3-4, 3-5, 3-23
Multimedia.......... 2-3, 2-14, 3-5, 3-22, 3-23, 3-24, 3-26, 4-2, B-1, B-3
multiply.......... 2-14, 3-2, 3-4, 3-16, 3-22, 3-23, 4-1, 4-2, 4-4, A-8, A-86, A-87, A-125, B-11, B-12, B-13,
B-14, B-23, B-24, B-25, B-26, B-84, B-85, B-86, B-87, B-91, B-92, B-93, B-95, B-111, B-113,
B-118, B-120, B-122, C-16, D-30
Multiply............ 1-1, 1-2, 2-3, 2-6, 2-9, 2-11, 3-2, 3-14, 3-16, 3-21, 3-22, 3-23, 3-24, 3-26, 4-1, B-1, B-3, B-5 
MULTU.......... 3-16, 3-23, 3-26, A-87, A-141, B-3, B-25
MULTU1.......... 2-14, 3-23, 3-26, 4-2, B-3, B-26, B-163
N 

NaN.......... 10-11, 11-6, D-8, D-10, D-11, D-12, D-13
NaNs.......... 2-18
NBE.......... 4-23, 5-11, C-28
NEG.......... 2-18, 11-6, D-31
NEG.fmt.......... 3-21, 10-14, D-41
IN cle [2 1: Bart eeecpre cece cere ener rrcrecr cecererey rccecrrrn tree ctcerere cee peer cera ererceecrercera reper 3-21, 8-3, D-2, D-31, D-32, D-33 
NMI.......... 4-17, 4-18, 4-19, 4-33, 5-2, 5-5, 5-7, 5-8, 5-9, 5-10, 5-12, 8-10, 8-13, 9-11, 12-6, C-14
Nonmaskable.......... 4-33
NOR.......... 3-15, 3-25, A-3, A-88, A-141, B-4, B-124
INQUITVANIZATI OM ics det ee alec see eke shackle neta lec sta cee au ha st eceede cntiat saheeua stu ceeaht saree vents ttrcuws sacha seh ictayats ta kuaauee trad doek i 2-9 
NOT.......... 6-2, 13-8, 13-20, A-3, A-88, B-124
NotWordValue.......... A-11, A-12, A-13, A-14, A-38, A-40, A-86, A-87, A-110, A-111, A-112, A-113, A-114, A-115,
B-7, B-9, B-11, B-12, B-13, B-14, B-23, B-24, B-25, B-26, B-68, B-70, B-93, B-95, B-113,
B-120, B-122
NullifyCurrentInstruction.......... A-8, A-18, A-21, A-22, A-24, A-26, A-29, A-30, A-32, C-5
O 

Offset.......... 6-4, 6-5, A-62, A-66, A-74, A-78, A-98, A-102, A-120, A-124
opcode.......... 2-16, 3-9, 5-22, 6-1, A-2
Opcode.......... 3-23, 3-24, 3-25, 6-20, 9-3, A-141, A-142, B-163, B-164, B-165, C-6, C-25, C-26, C-35,
C-36, C-41, C-42, D-40, D-41
Operand.......... 1-2, 3-14, 3-22, 3-23, A-104, B-1, B-3, D-1, D-4, D-31, D-35
.@/9)-1¢- Lolo leeeeeeernercecer reece eect crce er erecr ocr tetber tree rc recercerpececcrerttcr ce cecnerr ere re eeterece er err ery 2-4, 3-14, 3-15, 3-23, B-3 
OR.......... 2-9, 3-14, 3-15, 3-25, A-3, A-88, A-89, A-90, A-139, A-140, A-141, B-4, B-124, B-125, B-160
ORI.......... 3-14, A-90, A-141, B-163, C-41, D-40
Ov.......... 4-20, 5-8, 5-26
Overflow.......... 2-9, 4-30, 5-2, 5-8, 5-26, A-11, A-12, A-13, A-14, A-34, A-35, A-36, A-37, A-50, A-51, A-106,
A-107, A-108, A-109, A-114, B-31, B-35, B-37, B-39, B-42, B-44, B-144, B-148, B-150

OV EREEOW 2 aiacs ut Atanas a i et an a a oa Se A a te 5-5 
1A ol Rbereer error eee rere reer eerrrer rrcrerer per eerie rr pce erape etree orreeer price tere eee rere nee rrrem rer rey reree reer: 4-28, 4-30, 9-2, 9-10, 9-11 
P 

AO] 4 a ene eee ee PEP PPE eee ere rere eerie ce area L rence ene ener ae reee cea rere cers 12-3, 12-4 
POEXEB wi syetactiatys dieu ioeiaese bil edie eta latin ee inva eats ee 12-3, 12-4 
PAE XAG: saicrepice ee Sons copie oeiasae tht rea et Hoes ac sth neh gusceter eco ee None ceed leet cade ot ioe wath att fuse Mepmenns 12-3, 12-4 
oa (4 2) ee eee ere Pere eer erreer PCE Cee Career ee Tererr Ree EE reel erence er ner erent eee ercrree ceric eeeeree eres Terr 12-3, 12-4 
PAC Sectiviteelendiaty a Riv iinet ags eines Pave elie C-6, C-7, C-9, C-10, C-11, C-12 
PABSH.......... 3-24, B-4, B-27, B-164
PABSW.......... 3-24, B-4, B-28, B-164
PADDB.......... 3-24, B-3, B-29, B-164
PADDH.......... 3-24, B-3, B-30, B-164
PADDSB.......... 3-24, B-3, B-31, B-164
PADDSH.......... 3-24, B-3, B-35, B-164
PADDSW.......... 3-24, B-3, B-37, B-164
PADDUB.......... 3-24, B-3, B-39, B-164
PADDUH.......... 3-24, B-3, B-42, B-164
PADDUW.......... 3-24, B-3, B-44, B-164
PADDW.......... 3-24, B-3, B-46, B-164
PADSBH.......... 3-24, B-3, B-47, B-164
Page.......... 2-16, 4-8, 4-10, 6-16, 6-17, 9-7
PageMask.......... 2-15, 4-5, 4-10, 6-14, 6-15, 6-16, C-38, C-39, C-40
PAND.......... 3-25, B-4, B-48, B-165
PC.......... 1-2, 2-3, 2-6, 2-19, 3-16, 3-17, 3-18, 4-1, 4-3, 4-4, 5-12, 9-10, 12-1, 12-2, 12-3, 12-5, 12-7,
12-8, 12-9, 12-10, 12-11, 12-12, 12-13, 12-14, 12-15, 12-16, 12-17, 12-18, 12-19, 12-20,
13-7, A-4, A-9, A-17, A-18, A-19, A-20, A-21, A-22, A-23, A-24, A-25, A-26, A-27, A-28,
A-29, A-30, A-31, A-32, A-52, A-53, A-54, A-55, C-2, C-3, C-4, C-5, C-16, D-6, D-7
PC tracing.......... 1-2, 2-19, 12-1, 12-3
PCEQB.......... 3-25, B-4, B-49, B-164
PCEQH.......... 3-25, B-4, B-52, B-164
PCEQW.......... 3-25, B-4, B-54, B-164
PCGTB.......... 3-25, B-4, B-56, B-164
PCGTH.......... 3-25, B-4, B-59, B-164
PCGTW.......... 3-25, B-4, B-61, B-164
PCPYH.......... 3-25, B-5, B-63, B-165
PCPYLD.......... 3-25, B-5, B-64, B-165
PCPYUD.......... 3-25, B-5, B-65, B-165
PDIVBW.......... 3-24, B-5, B-66, B-69, B-71, B-165
PDIVUW.......... 3-24, B-5, B-68, B-165
PDIVW.......... 3-24, B-5, B-70, B-165
A= 10 PREPRESS CREE RE CEES CRT CEE CRE ee Rr Creer MneCr ERE ere pe eee eee PRR eT me ge ete Mee ee ee 2-15, 4-5 
POT C es cecssh ck cactesceateaticeadeattes saves ce caiasvaise dag east heaateateatseys deaths iadasshearuietestaiadecien dears ave ite 4-19, 5-8, 5-13 
Performance ........ 1-2, 2-1, 2-15, 2-19, 3-20, 4-5, 4-17, 4-19, 4-28, 4-29, 4-30, 5-2, 5-5, 5-7, 5-8, 5-9, 5-10, 
5-11, 5-13, 9-1, 9-2, 9-3, 9-4, 9-10, 12-6, C-25, C-26, C-35, C-36 
PEMOFMANCE:MONMOM ee ieites cece te beatttevbertuttebeides abl venetebeanentovdeines dein dhbmet cen eluttebeluate dvaul dud saul ob baaenevientieid 3-20 
PEXCH.......... 3-25, B-5, B-72, B-165
PEXCW.......... 3-25, B-5, B-73, B-165
PEXEH.......... 3-25, B-5, B-74, B-165
PEXEW.......... 3-25, B-5, B-75, B-165
PEXT5.......... 3-25, B-5, B-76, B-164
PEXTLB.......... 3-25, B-5, B-78, B-164
PEXTLH.......... 3-25, B-5, B-79, B-164
PEXTLW.......... 3-25, B-5, B-80, B-164
PEXTUB.......... 3-25, B-5, B-81, B-164
PEXTUH.......... 3-25, B-5, B-82, B-164
PEXTUW.......... 3-25, B-5, B-83, B-164
PFN.......... 2-15, 4-5, 4-8, 6-16, C-10, C-11, C-12, C-39, C-40
PHMADH.......... 3-24, B-5, B-84, B-165
PHMSBH.......... 3-24, B-5, B-86, B-165
Physical.......... 2-10, 2-15, 2-16, 4-5, 4-25, 6-3, 6-4, 6-18, A-4, A-6, A-7, C-7
PINTEH.......... 3-25, B-5, B-88, B-165
PINTH.......... 3-25, B-5, B-89, B-165
PLZCW.......... 3-25, B-4, B-90, B-163
PMADDH.......... 3-24, B-5, B-91, B-94, B-96, B-112, B-114, B-119, B-121, B-123, B-165
PMADDUW.......... 3-24, B-5, B-93, B-165
PMADDW.......... 3-24, B-5, B-95, B-165
PMAXH.......... 3-24, B-4, B-97, B-164
PMAXW.......... 3-24, B-4, B-99, B-164
PMFHI.......... 3-24, B-5, B-101, B-165
PMFHL.......... 3-24, B-5, B-102, B-163
PMFLO.......... 3-24, B-5, B-106, B-165
PMINH.......... 3-24, B-4, B-107, B-164
PMINW.......... 3-24, B-4, B-109, B-164
PMSUBH.......... 3-24, B-5, B-111, B-165
PMSUBW.......... 3-24, B-5, B-113, B-165
PMTHI.......... 3-24, B-5, B-115, B-165
PMTHL.......... 3-24, B-5, B-116, B-163
PMTLO.......... 3-24, B-5, B-117, B-165
PMULTH.......... 3-24, B-5, B-118, B-165
PMULTUW.......... 3-24, B-5, B-120, B-165
PMULTW.......... 3-24, B-5, B-122, B-165
PNOR.......... 3-25, B-4, B-124, B-165
OU S sessed tebe ae afb gt A oc ite oa! a atawrc Seas a aslatie ac haat aad galetnls Sods cat pideat nad caeaves pedustl. Coneed siecle nininest tai seater: 4-9, A-92 
POR.......... 3-25, B-4, B-125, B-165
PPAC5.......... 3-25, B-5, B-126, B-164
PPACB.......... 3-25, B-5, B-128, B-164
PPACH.......... 3-25, B-5, B-129, B-164
PPACW.......... 3-25, B-5, B-130, B-164
PICCISO Mi tici a eciis unio atch ieee leads titers caeke tive shesniith slaaiech aenkeesivtunieh sanites Magaed cankeeeticcvpsheanieah thins vad 9-4 
POC CUO INS sos ca dase tec eet eaai a ra eg dese ceg a wedded saddens sa Sven adagy Cac daca ead cee ce dees adansdetdaceary. ehegsaheats 1-2, 2-3, 4-23, 9-7 
PYOCICUOMiczstescsates te bicnns aces Lents Soocat tepsulin ds boveteana Soi cat Sapsulin a peaie: Seats finua ood cat Segdelin ca iaues Se0h enneds lemueOeiaag a auin aegis 4-23 
PREF.......... 3-19, 4-23, A-2, A-91, A-141, B-163, C-41, D-40
prefetch.......... 5-19, A-91, A-92
Prefetch.......... 1-1, 1-2, 2-11, 2-17, 3-19, 8-8, 9-7, A-7, A-92
POLK sees ce ear aad ace ee tet seed cok ee ths tat cet l eee hes t ako aaeh ued tet tian wala ita Seaaseeunas eengeetatces 8-3 
PREVH.......... 3-25, B-5, B-131, B-165
PRId.......... 2-15, 4-5, 4-22
PIMONIGS os 5 ace aa ie eau Bah culating uate nda tacamhy a vaaachia te cbagpabacen vphaceotendeoetyy scckens i eandec aval Rye lie tay cotta 12-7 
PONIVIN SCS 2c bade Sey sa bed oO ata ndet steeds ea van bees bege aa sdee Shesg iabedcas a bede ck cdobe cease bode wt cas paceese bode Mc iota inetetaaa eee tia 9-5, 9-11, C-8 
PIIVIICGS MODS hos sccseceteceeehe tata A viecec dit geesneedgacesdaadanadetavts do ataauaataanev dv vandal daadha Sac gecavie beadeb oa edi ddcatnea date 9-5, 9-11 
Probe.......... 3-20, 4-6, 4-14, 5-17, 6-20
PROT3W.......... 3-25, B-5, B-132, B-165
SCLC Osea cnccc akc caet cate ca atest ees s eee eae ean Tee ate aaa ta eae aan eae ot aoa eden aa ea aea cee aintiee Ta SaA Tae eaaleo uaa TAT ae ease cae 2-15, 4-5 
pseudocode.......... A-1, A-2, A-3, A-4, A-6, A-8, B-2, D-2
Pseudocode.......... A-3, A-4, A-6, B-2, D-2
PSLLH.......... 3-25, B-4, B-133, B-163
PSLLVW.......... 3-25, B-4, B-134, B-165
PSLLW.......... 3-25, B-4, B-135, B-163
PSRAH.......... 3-25, B-4, B-136, B-163
PSRAVW.......... 3-25, B-4, B-137, B-165
PSRAW.......... 3-25, B-4, B-138, B-163
PSRLH.......... 3-25, B-4, B-139, B-163
PSRLVW.......... 3-25, B-4, B-140, B-165
PSRLW.......... 3-25, B-4, B-141, B-163
PSUBB.......... 3-24, B-3, B-142, B-164
PSUBH.......... 3-24, B-3, B-143, B-164
PSUBSB.......... 3-24, B-3, B-144, B-164
PSUBSH.......... 3-24, B-3, B-148, B-164
PSUBSW.......... 3-24, B-3, B-150, B-164
PSUBUB.......... 3-24, B-3, B-152, B-164
PSUBUH.......... 3-24, B-3, B-155, B-164
PSUBUW.......... 3-24, B-3, B-157, B-164
PSUBW.......... 3-24, B-3, B-159, B-164
PTagLo.......... 4-31, 4-32
PTE.......... 2-15, 4-5, 4-9
PTEBase.......... 4-9
ODO kore eee eee eee EPC PERO Teer CE TERR eee CERT ee Eee TEE EEPEC RTE rer rere cme eee eee cerca ey 4-9 
PXOR.......... 3-25, B-4, B-160, B-165
Q 

QFSRV.......... 3-25, B-5, B-20, B-21, B-22, B-161, B-164
ONeUIN Sas teitee eccees, etek ah fecha eect yeahh tet cae seh Sa ae Meet ed eee Se ee teeta Be cee sta 11-6 
Quadword.......... 1-2, 3-5, 3-8, 3-10, 3-12, 3-25, 8-9, B-4, B-5
QUADWORD.......... A-7, B-10, B-162
Quintibyte.......... 3-10, 3-12
quotient.......... 4-4, A-38, A-40, B-7, B-9
R 

RIQ000 eetits sais wea ends hla inated eee Men ee ee lei ee ek. 1-3 
PRAQOO ec cetebvadtielics vga eda eave advetetieleallely ccubelvlyccuntys lenebbedder dtl vyeptybvay Gaede iarebedy ecdetayetaebbeeceb lady eltaeers 1-3, 6-2 
PAMIGOINIS sosan.aetitancacecbectest a baagebanebeduat aaaheduoe abe gnet sae wedevsceasuseava biceat cauuanbanenedusd daaaasaneaadbestauewemsaaneiecas 2-15, 4-5, 4-11, 6-2 
Random iiasntvsiceaniiinddecnhinvadicndiieadts 2-15, 3-20, 4-5, 4-7, 4-11, 4-14, 5-11, 5-16, 5-17, 6-20, C-40 




RandOM DS: 4... ce.aie ain Lelie ea ey en eee en ee ee a ae ee eee ee C-40 

Refill... 2-3, 2-17, 4-12, 4-14, 5-2, 5-7, 5-9, 5-16, 8-8, A-56, A-57, A-58, A-62, A-66, A-67, A-68, 
A-70, A-74, A-78, A-79, A-93, A-94, A-98, A-102, A-103, A-116, A-120, A-124, B-10, B-162, 
C-7, C-8, D-26, D-37 


REGIMM.......... 5-22, A-141, A-142, B-163, C-41, D-40
register.......... 10-2, 10-6, 11-2, 11-3, 11-8, 11-9
Register.......... 2-5, 2-6, 2-8, 2-15, 3-14, 3-15, 3-17, 3-20, 3-25, 4-3, 4-4, 4-5, 4-6, 4-7, 4-8, 4-9, 4-10, 4-11,
4-12, 4-13, 4-14, 4-15, 4-16, 4-17, 4-18, 4-19, 4-21, 4-22, 4-23, 4-25, 4-26, 4-27, 4-28,
4-29, 4-30, 4-32, 4-33, 5-8, 6-9, 6-10, 6-12, 6-16, 8-25, 9-2, 9-3, 9-4, 9-10, 10-7, 10-8, 10-9,
13-2, 13-3, 13-4, 13-5, 13-7, 13-8, 13-9, A-3, A-4, A-5, A-9, A-54, B-3, B-5, B-161
registers.......... 10-4
Registers.......... 2-1, 2-3, 2-14, 2-15, 3-17, 4-1, 4-2, 4-3, 4-4, 4-5, 4-8, 4-26, 4-28, 4-31, 6-14, 9-2, 9-3, 9-4, 13-3
ld Beene epee rere eer ete errr oer reer Peer ce epee cee emerre or eceer eee cero erereane er correct eee ee cere eeree 8-11, 8-14, 8-15 
REQUeSt Ariat sectvi actin Sees ei eesti ei en nee teeth ies ee eet 9-9 
PRES ices eva biceat cavawiex te acccus danSudenca bocca aanscaaeaa aeeues uedabaca sauces te ovacec Rav ou dana oecantaenau cease pendes deadtan deatguanne soaase taanentance 4-19, 5-8 
Reset.......... 4-18, 4-19, 5-1, 5-2, 5-7, 5-8, 5-9, 5-10, 5-11, 8-11, 9-4, 12-6, 13-14
RESET.......... 5-11, 5-12, 8-11, 8-14
Fl ocde sf Neste est cee oth Ped teste ees Noe see Lt ed tate Me be 2-16, 4-20, 5-8, 5-22, 6-1 
PROOU fas acs th cael ta cite aetcass hea caanets Gate beed ee he ete t hase a ceed ao hh edt ae olscade esauagtaseadihsucsuuiianaantse 3-21 
FROULS ris csa eeu deve cveleguissiuevla cla dotnigh ddyutails db na site daudnnaeongs Vago dad gotehulsslgadeevgeruediddyeladh devesdsdugaawed dudeiaevaseucdaaden 3-25, B-5 
ROUND: bes sencterecetis cctv Si eae ate ea a A, Aad a cin aa ce at kale D-32 
ROUND stints. cages sears ahsevs dace A cde iy a cegse cde tes hace cas heea cee ee hachads eden bev baad vies deans faadeeebescevasacasdth 3-21, 10-14, D-41 
PROUND IW sersiiect est steers ta yet rie sidect calc havea odes ela ects os uteetaass bide aks sabe skate nhavan aa Sbetearenigeitrsantae i ne upeeseecaes cetaptaatresea eae D-33 
ROUND.W.fmt.......... 3-21, 10-14, D-41
RSQRT.......... 2-18, 3-26
Ss 

OO) ae ee aa dees tas a Ate aden a eed eee aad eee 4-29, 9-2, 9-5, 9-11 
SS ieee sh Seas cocean te apateg te deceave atten eataceas sateen oa Gucgan feete fasta giteaetecaetaa e becpat vee steateragutents Seagasciatene manesest tetamk te 4-29, 9-5, 9-11 
sa.......... 3-3, A-41, A-42, A-44, A-45, A-47, A-48, A-104, A-110, A-112, B-133, B-135, B-136, B-138,
B-139, B-141
SA.......... 2-3, 2-11, 2-12, 2-13, 2-14, 3-25, 4-1, 4-2, 4-3, 4-4, B-17, B-20, B-21, B-22, B-161
Saturate.......... B-34, B-36, B-38, B-41, B-43, B-45, B-147, B-149, B-151, B-154, B-156, B-158
Saturation.......... B-3, B-31, B-35, B-37, B-39, B-42, B-44, B-144, B-148, B-150, B-152, B-155, B-157
SAU FAT OM att thesccrk te hatade te vsech ade teenie dtvererk te hates tach Austen dew aeoice Aa duatalevasnace Movad seweaaads dated fetes taeonanad Monaat deanna 3-24, B-3 
SB.......... 3-4, A-93, A-141, B-163, C-41, D-40
SC.......... 1-2, 3-4, A-142, B-165, C-42, D-41
SCD.......... 1-2, 3-4, A-142, B-165, C-42, D-41
SD.......... 3-4, 13-8, A-5, A-94, A-141, B-163, C-41, D-40
SDC1.......... 3-5, 3-21, 10-13, A-141, B-163, C-41, D-34, D-40
SDL.......... 3-4, 3-8, A-95, A-96, A-99, A-141, B-163, C-41, D-40
SDR.......... 3-4, 3-8, A-95, A-99, A-100, A-141, B-163, C-41, D-40
SOQMONts sve ise Ada Ra ee a i Se Ais 2-16, 4-9, 6-1, 6-8, 6-9, 13-9 
OCI SM beeches seceded sceeek satecewestdale ches stoceda lla stvenstdale hts teaeiela aeatet aa ceusatsatesth sagdeuialaasdtet sauseniationweetaaett tice 6-9, 6-10, 6-12 
Semaphore.......... 3-4
Septibyte.......... 3-10, 3-12
Serialization.......... 3-19
Sextibyte.......... 3-10, 3-12
SH.......... 3-4, A-103, A-141, B-102, B-163, C-41, D-40
Shift.......... 2-3, 2-11, 3-14, 3-15, 3-25, 3-26, 4-2, 4-4, B-4, B-5
AS] ST 1=) prsnee eer res eeceee pein ere cn rcemeree ce ce Peen ere eer cer prs terre em eee preeeer rent cpio hee cerrr ne cee rn reer rete n er ee err meer em emer 2-3 
=U [o)\ a Reeecerenecee pn ee eercch ceeteen ee rreee eee nee Peer eer erie reer cc eeerne rece sneer renee a eeeenecr rere ches ones Perr ieee creer peerre eerie 6-2 
SIQN devel: 2-7, 2-9, 2-16, 3-4, 3-16, 3-17, 6-1, 6-3, 10-10, 10-11, 10-12, 13-8, A-11, A-12, A-13, A-14, 


A-17, A-18, A-19, A-20, A-21, A-22, A-23, A-24, A-25, A-26, A-27, A-28, A-29, A-30, A-31, 
A-32, A-35, A-36, A-38, A-39, A-40, A-44, A-45, A-46, A-56, A-57, A-58, A-60, A-64, A-67, 
A-68, A-69, A-70, A-71, A-72, A-74, A-75, A-76, A-78, A-79, A-86, A-87, A-92, A-93, A-94, 
A-96, A-99, A-100, A-103, A-104, A-105, A-107, A-108, A-110, A-111, A-112, A-113, A-114, 
A-115, A-116, A-117, A-118, A-121, A-122, A-128, A-130, A-131, A-134, A-135, A-138, 
B-7, B-9, B-10, B-11, B-12, B-13, B-14, B-23, B-24, B-25, B-26, B-68, B-70, B-93, B-95, 
B-113, B-120, B-122, B-136, B-137, B-138, B-140, B-162, C-2, C-3, C-4, C-5, C-6, D-2, 
D-14, D-27, D-31 


sign_extend.......... A-11, A-12, A-13, A-14, A-17, A-18, A-19, A-20, A-21, A-22, A-23, A-24, A-25, A-26, A-27, 
A-28, A-29, A-30, A-31, A-32, A-35, A-36, A-38, A-40, A-56, A-57, A-58, A-60, A-64, A-67, 
A-68, A-69, A-70, A-72, A-76, A-79, A-92, A-93, A-94, A-96, A-100, A-103, A-104, A-105, 
A-107, A-108, A-110, A-111, A-112, A-113, A-114, A-115, A-116, A-118, A-122, A-128, 
A-130, A-131, A-134, A-135, A-138, B-10, B-162, C-2, C-3, C-4, C-5, D-14, D-27 

Signal.......... 8-3, 8-7, A-8

SignalException... A-8, A-11, A-12, A-33, A-34, A-35, A-50, A-58, A-67, A-68, A-70, A-79, A-94, A-103, A-114, 
A-116, A-126, A-127, A-128, A-129, A-130, A-131, A-132, A-133, A-134, A-135, A-136, 
A-137, A-138 


SIO.......... 4-17, 4-18, 4-19, 4-33, 5-2, 5-5, 5-7, 5-8, 5-9, 5-10, 5-25, 8-10, 12-6, 13-8, C-14
SIOINT.......... 8-10
SIOP.......... 4-19, 5-25
| ered oper err Pere ore arr meron P er 12-10, 12-11, 12-12, 12-13, 12-14, 12-15, 12-16, 12-17, 12-18, 12-19, 12-20 
SLL.......... 3-15, A-74, A-78, A-104, A-141
SLLV.......... 3-15, A-74, A-78, A-105, A-141
SLT.......... 3-15, A-82, A-83, A-106, A-141
SLTI.......... 3-14, A-82, A-83, A-107, A-141, B-163, C-41, D-40
SLTIU.......... 3-14, A-82, A-83, A-108, A-141, B-163, C-41, D-40
SLTU.......... 3-15, A-82, A-83, A-109, A-141




SEW recep at tas ea Sng eed va uae ee a ee ve eae a noe eal eee B-102 
SNOOPING: Ses intas aa At a ath tata ate lee atae tit ai Ate ita 2-17 
6) dm]  eneer Pr ee Cncer EERE ee eec PEPE eree cree rerer retina Pearce retrace 5-22, A-9, A-141, B-163, C-41, D-40 
SQi sieges Lai ee ee ee 3-5, 3-25, 13-8, A-141, B-4, B-162, B-163, C-41, D-40 
SORT eseviititech viene Ris ein itt adda tee ane ae a ian ahve 2-18, 3-26, D-35 
SOR AMM ac daeeceatvandeneleetesscve 3 Ascdeears ae snatea ean hac OA aoa aesegs radar iags eater tee 3-21, 10-14, D-41 
pio [WE-\ (= reer eere reer ere eepe rea creer erry ceed treater ee eeepc er cere eee been eerecr cere beeoecercreenereer cece eececa rer eecreeeeree erences 3-21 
Square Root .......... D-35
Payee cheveapeteevebetiss htaetede ei eye Ate veeeaes tgeetedy ee weet bee redler ace ee Senet abel OO wen babel ates 1-5, 4-16 
SRA .......... 3-15, A-110, A-141
SRAV .......... 3-15, A-111, A-141
SRL .......... 3-15, A-112, A-141
SRLV .......... 3-15, A-113, A-141
SS OG cass esse gags vaseaeos Saga gees sedate cheats Pan asia a age de cada sn gag Ca ees aed ee da bev ds sa esas ee ca ey 6-7, 6-10 
£S 1 (= ECAP ERET EE EPREEECOEEEREECEECECEECE EERE CERPEETEEPECECEECERPEETEET CORE CCERPEET ERE CERCRECEEREPrECEREMPEPE PEE ECR ETCECH TEE CCRC rrnTrry Ceererere, 6-6, 9-4 
Status .......... 1-5, 2-15, 3-5, 3-20, 3-21, 4-5, 4-16, 4-17, 4-18, 4-21, 4-25, 4-29, 5-2, 5-5, 5-7, 5-9, 5-11,
5-12, 5-13, 5-14, 5-16, 5-19, 5-23, 5-24, 5-25, 6-2, 6-6, 6-8, 6-9, 6-10, 6-11, 6-12, 6-13,
8-25, 10-2, 10-4, 10-7, 10-8, 10-9, 11-2, 11-8, 11-9, 12-3, 12-4, 13-4, C-1, C-7, C-9, C-13,
C-14, C-15, C-16


STAVTUS ticki ep tien be ie ae El ees ee eee 9-2, 9-10, 9-11, 12-6, 13-5, 13-6 
STONING etentres tet ealetys Als RO een BA Sa ails SA Eta ia A hata Gala A tate i a a 2-6, 4-31 
SLSSMING BIS vase asics aac Seahecasnck daseegavsk cases adieasaeakegesiscl edeashiech sensei ag tastes bacaasbaeaas ey aide cates cafa,¥efeithus ceva tebcseupeeinestuetes C-10 
SUSI PUN Gas sacar oa eeteer once ahs acne ch vadeh dean tad Sura atls pagel sabgst rae shade tadoe taut ondadt slab teest aetaeteaerecs 1-2, 9-8, 9-10, B-20, B-21, B-22 
StoreFPR ............. D-2, D-4, D-5, D-12, D-13, D-16, D-17, D-18, D-19, D-20, D-23, D-24, D-28, D-30, D-31, 
D-32, D-33, D-35, D-36, D-38, D-39 
°S) Co] 111 (21 0010) 6 ren A-7, A-93, A-94, A-96, A-100, A-103, A-116, A-118, A-122, B-162 
FS EEE ere rE CATERED CERES EP ER LORE TRrereereeT erence eee errs 2-18, 3-15, 5-26, A-114, A-141, D-36 
SUB AME teste ctut aes eine edad dined elena al eee 3-21, 10-14, D-41 
SUBOUTING sf .vts vue eth elias aa ea ain ae anti iin ue eee ee tn ea 3-17 
bo {U] o<{=\0 [= 19) ener por erre py hee Err Seen peer er tia iy reer esr cier pier ert secre py fer trr aac uerr preter er te Serre py ier fr ee rerr esc crrr ery teas trey 2-4, 6-17 
SS1U |e) i ¢cle|Peeeererteeerceperrrreree crete recere creeper oecerece ree pa coccercr ereece ee raeecerreecrrer rereercrcrecceer 3-15, 3-21, 3-24, B-3, B-5 
SUBU )issinsisaekeiihiel dete ni eel nae i ee ee eee 3-15, A-114, A-115, A-141 
SUPGRVISOM a. 2acebec Dis eeseaetacetecbes ebapecty ie ebeseaaptadeeesbeaeekcaiicegdaeebiecvectienteeehd 4-18, 5-15, 6-10, 6-12, 9-11, 13-5, 13-14 
Supervisor............ 2-16, 2-19, 4-17, 4-18, 4-29, 5-2, 5-15, 5-22, 5-23, 6-6, 6-7, 6-10, 6-12, 9-2, 13-5, 13-6, 
C-1, C-14, C-15 
SUPERVISOR ‘evap signs Saaryae igen ava gle ee sk a vac eae agp ee dap aes decease eae che eae vay ae 9-5 
SUSCO ui Aine A in ee Ae ee A hn en oe 6-7, 6-10 
VV slis NA Pate a iata a hd cet vata Bata eseautie cteae ete SeasNii cedeaee ty anata abititee tas 3-4, A-5, A-116, A-141, B-163, C-41, D-40 
p(n ecetrere eereerecee ere ceerer cet crcreeecree ecerererceeerrere ore 3-5, 3-21, 10-13, 13-2, A-141, B-163, C-41, D-37, D-40 
SW G2) sunieeeadiieas anand iva ee ei ie ea ieee pees A-142, B-165, C-42, D-41 
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SW litte easel ean Lani Se eee 3-4, 3-8, A-117, A-118, A-121, A-141, B-163, C-41, D-40 
SWRit sata eee ieee Ah aos eo ete ota 3-4, 3-8, A-117, A-121, A-122, A-141, B-163, C-41, D-40 
SYNC .......... 2-11, 2-12, 2-13, 3-19, 5-24, 6-17, 13-9, 13-16, 13-18, 13-20, A-125, A-141, C-13, C-27,
C-28, C-29, C-30, C-31, C-32, C-33, C-34, C-35, C-36, C-38, C-39, C-40

SYNGCHTONIZAUNO Mavis asectes. bdeacive leeds cahdevealascebavese tide lanbdevesabents ea bdsee seein aletea ay bide aaa eae 2-11, 3-19 
DY Sic fears bevecs Acacias eat sanedie leeches usvect uscd dusteck satevvactaecs naguenpetucnsthad rant sade cachdeouieey hesgstisdauatvadeeetiiada arse 4-20, 5-8, 5-20 
OY, Sagat es eaeataes aeesebaae Maleate eae tas Melee rae siete a tenyc Seas atau tara bhgast cad sotaate Sodas tagiresinpect ewidask sstgaeaatie pega stan 8-3 
SYSAACK ssnhvinnsieniinnnie 8-3, 8-9, 8-12, 8-13, 8-14, 8-16, 8-19, 8-22, 8-25, 8-26, 8-27, 8-28, 8-29 
SYSADD Revseveh cess tesaeteie ee vent tebede th ataereogets Behe eqeetebeartieed eeateebeld cevteeeeyt he beastie eedtteelealas eer ett ics candy’ 8-3, 8-7 
SY OAST ARTs cscs reeecsin ccithttevei ead Senne sten adties tee eden Interac even: 8-3, 8-7, 8-9, 8-12, 8-13, 8-16, 8-19 
FS) eo] | mrp eee REECE CEEPECRE EEC ECR CE CEE nec eee TeEC Pere Teen epee eer cere reer ereeeeree 8-3, 8-7 
SYSCall in ech ncdeave hee vecceieeteaed eds Wak ice teawa cee cach Mievaaa dd dela dev each iogeaaidaieti deci qavduadu jetveadh igeaydavvan deeded 4-20, 5-2, 5-8, 5-9 
SYSCALL .......... 2-11, 3-18, 4-4, 5-10, 5-20, 9-7, 9-8, A-126, A-141
SYSDACK.......ecsccteeeeeeeeeeees 8-3, 8-10, 8-12, 8-13, 8-16, 8-17, 8-19, 8-20, 8-22, 8-25, 8-26, 8-27, 8-28, A-125 
BS fo] BYP. teeereeeerceper tr Pere rece Perper erereeer er recrerceereer rte pececccercrr herr otcee creer eeerte cree rr ceeceee 8-3, 8-6, 8-7, 8-9, 8-16, 8-17 
SYSDSTARY 4st cide vat a te 8-3, 8-10, 8-12, 8-13, 8-16, 8-17, 8-19, 8-20, 8-25 
SYSRDiseshgeteazeretiyeahegheatea ae pacdecdeatepectt fe belay ohh wade hs weketet elt ees ee ebb hngeteand abe eeut chk siete eeld aeeeasten eee eethe ates 8-3 
DVO POI ZE see series at a teenie ei te eee is ee ae 8-3, 8-9, 8-12, 8-13, 8-16, 8-19 
OY VER eet tine ert eee acteas cece aus ee asta cutee catene atatn gence ttet sete aiets cet ts uta cse en tear ae su coh wae geet deen gg anes 8-3 
T 

TaGinvkteetisivkupninewhiehintiaiaiead hein hatiiwhigeiiee: 2-6, 2-7, 2-15, 4-5, C-9, C-11, C-12, C-13 
TAGs ets celeb eiveheescacttedelncet Bev tebeavthestee vet be aeen travel eaelettele ly era dle titel ae, viatephalae nanny cibbiveniheee sieteaeere C-6 
Mite) nlepeeen eect nrc ore eee ereneecre arc rece errr reeecrceerbcrerrcrec tere rierrrccert reer 2-15, 4-5, 4-31, 4-32 
Tag A lissasttiatvkeg atin i eis ee en neath Bee heed ie ee ate C-10, C-11 
TAGLO fete setoetnice toes vevtevteines ts euautebeueettene legen Beart hayeheh ertheleivetthelvereeelniyehaneina nertecnitalt 2-15, 4-5, 4-31, 4-32 
i i: 1o)| XG jee eeeepreetepereeetceeenet rec eer ence p ore eeenerrenr reer crac n seer creer cree reeset eer retreec re ererr errree reer C-9, C-10, C-11, C-12 
1G 0 eRe Oe cere nr POSE ree preeeP er orc eee cre on errerePcE ree che reeceree rere re er EER Tere errr ener reece rece rere erry 4-31, C-9, C-12 
TargetAddressiy wicaesiecieeva dh elected ieee ie advice let noe el le eh athe eee veal ead C-10, C-11 
TEQ .......... 3-18, 5-27, 9-8, A-127, A-141
TEQI .......... 3-18, 5-27, 9-8, A-128, A-142
TGE .......... 3-18, 5-27, A-129, A-141
TGEI .......... 3-18, 5-27, A-130, A-142
TGEIU .......... 3-18, 5-27, A-131, A-142
TGEU .......... 3-18, 5-27, A-132, A-141
LUNa =| gereerenee eeepc cre ceree eter cet eereeecrreeeeer rere ee eee erorcr re reererr cern reer erereerr rere certrereerae merece re 4-13, 4-15, 4-16 
TLB .......... 1-2, 2-3, 2-6, 2-7, 2-15, 2-16, 3-20, 4-5, 4-6, 4-7, 4-8, 4-9, 4-10, 4-11, 4-12, 4-14, 4-17,
4-20, 4-29, 5-2, 5-7, 5-8, 5-9, 5-10, 5-11, 5-12, 5-16, 5-17, 5-18, 6-1, 6-2, 6-3, 6-4, 6-7,
6-8, 6-9, 6-12, 6-14, 6-15, 6-16, 6-17, 6-18, 6-19, 6-20, 12-6, A-6, A-56, A-57, A-58, A-62,
A-66, A-67, A-68, A-70, A-74, A-78, A-79, A-92, A-93, A-94, A-98, A-102, A-103, A-116,
A-120, A-124, B-10, B-162, C-6, C-7, C-8, C-28, C-37, C-38, C-39, C-40, D-26, D-37
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TEBEMGries acces fap eccees govt cock age vapestinn gegen op eae ag ee vac qaaee day eee qa Shep qed depen de ea eee eee C-37 
PE B bncesteiyeeetieit Atta a bet A ay Sha Cee te kat Ate 4-8, 4-20, 5-8, 5-16, 5-17 
RULES Feces ek Fiche ea ates ee eet tase ceee ase gtaee needa eee tte 3-20, 4-6, 5-17, 5-18, 6-2, 6-20, C-37, C-42 
TEBRveosiact celine La ae Da a 2-13, 3-20, 4-6, 6-20, C-38, C-42 
PLB S areca teeter Ahn Ry, bee A tn oes a ame la  ata Bier ketal Bites 4-8, 4-20, 5-8, 5-16, 5-17 
SLEW Lace aaes ch cdecie ese sagstvs deeeees ccvestneg catecnednddeeat sinsaBAaceuestt tear. 2-13, 3-20, 4-6, 4-8, 6-20, C-28, C-38, C-39, C-42 
eee A Ue terre trecreeeece ret ceeecree renee eee tice ecrecr creepers rere tree 2-13, 3-20, 4-7, 4-8, 6-20, C-28, C-38, C-40, C-42 
TLT .......... 3-18, 5-27, A-133, A-141
TLTI .......... 3-18, 5-27, A-134, A-142
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A. CPU Instruction Set Details 


This appendix provides a detailed description of the operation of each instruction. The 
instructions are listed in alphabetical order. 


Exceptions that may occur due to the execution of each instruction are listed after the 
description of each instruction. Descriptions of the immediate cause and manner of 
handling exceptions are omitted from the instruction descriptions in this appendix. 


Descriptions use a pseudocode notation explained in Section A.2. 


For an overview of the instruction set, refer to Chapter 3 of the User’s Manual. 
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A.1 Description of an Instruction 


Each instruction description contains several sections that contain specific information
about the instruction. The following sections describe the contents of each section in detail. 


A.1.1 Instruction Mnemonic and Name


The instruction mnemonic and name are printed as page headings for each page in the 
instruction description. 


A.1.2 Instruction Encoding Picture


The instruction word encoding is shown in pictorial form at the top of the instruction 
description. The picture shows the values of all constant fields and the opcode names for 
opcode fields in upper-case. It labels all variable fields with lower-case names that are 
used in the instruction description. Fields that contain zeroes but are not named are 
unused fields that are required to be zero. 


A.1.3 Format


The assembler formats for the instruction and the architecture level at which the 
instruction was originally defined are shown. 


A.1.4 Purpose 
This is a very short statement of the purpose of the instruction. 
A.1.5 Description 


If a one-line symbolic description of the instruction is feasible, it will appear immediately 
to the right of the Description heading. The body of the section is a description of the 
operation of the instruction in text, tables, and figures. This description complements the 
high-level language description in the Operation section. 


A.1.6 Restrictions 


This section documents the restrictions on the instructions. Most restrictions fall in the 
category of alignment requirements for memory addresses, valid values of operands, and 
order of instructions necessary to guarantee correct execution.


A.1.7 Operation 


This section describes the operation as pseudocode in a high-level language notation 
resembling Pascal. The purpose of this section is to describe the operation of the 
instruction clearly in a form with less ambiguity than prose. 


A.1.8 Exceptions 


This section lists the exceptions that can be caused by the operation of the instruction. It 
omits exceptions that can be caused by instruction fetch, performance counters, and 
breakpoints. It also omits exceptions that can be caused by asynchronous external events,
e.g., interrupts. Although the Bus Error exception may be caused by the operation of a load,
store, or PREF instruction, this section does not list Bus Error for load, store, or PREF
instructions because the relationship between these instructions and external error
conditions, like Bus Error, is asynchronous and implementation-specific.
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A.1.9 Programming Notes, Implementation Notes 


These sections contain material that is useful for programmers and implementors 
respectively but is not necessary to describe the instruction and does not belong in the 
description sections. 


A.2 Instruction Description Notation and Functions 


The Operation sections of the instruction descriptions describe the operation performed by 
each instruction using a high-level language notation, or pseudocode. Symbols, functions, 
and structures used in the Operation sections are described here. 


A.2.1.1 Pseudocode Language Statement Execution


Each of the high-level language statements in an operation description is executed in 
sequential order (as modified by conditional and loop constructs). 


A.2.1.2 Pseudocode Symbols 
Special symbols used in the notation are described in Table A-1. 


Table A-1. Symbols in Instruction Operation Statements


Symbol       Meaning
←            Assignment.
x y..z       Selection of bits y through z of bit string x.
mod          Two's complement modulo.
/            Floating-point division.
<            Two's complement less than comparison.
not          Bitwise logical NOT.
nor          Bitwise logical NOR.
xor          Bitwise logical XOR.
and          Bitwise logical AND.


BigEndian    Big-endian mode as configured at reset (0 = Little, 1 = Big) from core boundary signal.


I:, I+n:, I-n:
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This occurs as a prefix to operation description lines and functions as a label. It indicates 
the instruction time during which the effects of the pseudocode lines appear to occur
(i.e., when the pseudocode is “executed”). Unless otherwise indicated, all effects of the 
current instruction appear to occur during the instruction time of the current instruction. 


No label is equivalent to a time label of “I:”. 


Sometimes effects of an instruction appear to occur either earlier or later, during the
instruction time of another instruction. When that happens, the instruction operation is 
written in sections labeled with the instruction time, relative to the current instruction I, in 
which the effect of that pseudocode appears to occur. For example, an instruction may 
have a result that is not available until after the next instruction. Such an instruction will 
have the portion of the instruction operation description that writes the result register in a 
section labeled “I+1:”. 


The effect of pseudocode statements for the current instruction labeled “I+1:” appears to 
occur “at the same time” as the effect of pseudocode statements labeled “I:” for the 
following instruction. Within one pseudocode sequence the effects of the statements
take place in order. However, between sequences of statements for different
instructions that occur “at the same time”, there is no order defined. Programs must not 
depend on a particular order of evaluation between such sections. 


PC The Program Counter value. During the instruction time of an instruction this is the
address of the instruction word. The address of the instruction that occurs during the 
next instruction time is determined by assigning a value to PC during an instruction time. 
If no value is assigned to PC during instruction time by any pseudocode statement, it is 
automatically incremented by 4 before the next instruction time. A taken branch assigns 
the target address to PC during the instruction time of the instruction in the branch delay 
slot. 


PSIZE The SIZE, number of bits, of Physical address in an implementation. 
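
The PC rule and the delayed effect of a taken branch can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a description of the TX79 pipeline: the tuple encoding of instructions, the function name `run`, and the `pending_target` variable are hypothetical and exist only for this illustration.

```python
def run(program, start_pc=0, max_steps=16):
    """Return the list of PC values visited by a toy program.

    `program` maps an address to ("nop",), ("branch", target) or ("halt",).
    These tuples are a hypothetical encoding used only for this sketch.
    """
    pc = start_pc
    pending_target = None   # target of a branch taken by the previous instruction
    trace = []
    for _ in range(max_steps):
        insn = program[pc]
        trace.append(pc)
        if insn[0] == "halt":
            break
        # If the previous instruction was a taken branch, PC is assigned the
        # target during this instruction's time (this is the delay slot).
        # Otherwise PC is incremented by 4.
        next_pc = pending_target if pending_target is not None else pc + 4
        pending_target = insn[1] if insn[0] == "branch" else None
        pc = next_pc
    return trace


program = {
    0: ("branch", 16),   # taken branch
    4: ("nop",),         # delay slot: still executed
    8: ("nop",),         # not reached
    12: ("nop",),        # not reached
    16: ("halt",),
}
print(run(program))      # [0, 4, 16]
```

The instruction at address 4 runs even though the branch at address 0 is taken, because the target is assigned to PC only during the instruction time of the delay-slot instruction.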


A.2.2 Definitions of Pseudocode Functions Used in 
Instruction Descriptions 


A variety of functions are used in the pseudocode employed in the instruction descriptions. 
These functions are used to make the pseudocode more readable and also to abstract 
implementation-specific behavior. These functions are defined in this section. Certain 
additional functions specific to a particular coprocessor are described at the beginning of 
the appendix for that coprocessor. 


A.2.2.1 Coprocessor General Register Access Pseudocode Functions


Defined coprocessors, except for COP0, have instructions to exchange words,
doublewords, and quadwords between coprocessor general registers and the rest of the
system. What a coprocessor does with a word or doubleword supplied to it, and how a
coprocessor supplies a word or doubleword, is defined by the coprocessor itself. The
functions are listed in Table A-2.
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Table A-2. Coprocessor General Register Access Functions 


COP_LW (z, rt, memword)
z: The coprocessor unit number.
rt: Coprocessor general register specifier.
memword: A 32-bit word value supplied to the coprocessor.


This is the action taken by coprocessor z when supplied with a word from memory 
during a load word operation. The action is coprocessor-specific. The typical action 
would be to store the contents of memword in coprocessor general register rt. 


COP_LD(z, rt, memdouble) 


z: The coprocessor unit number.
rt: Coprocessor general register specifier.
memdouble: 64-bit doubleword value supplied to the coprocessor.


This is the action taken by coprocessor z when supplied with a doubleword from 
memory during a load doubleword operation. The action is coprocessor-specific. The 
typical action would be to store the contents of memdouble in coprocessor general 
register rt. 


dataword ← COP_SW (z, rt)


z: The coprocessor unit number.
rt: Coprocessor general register specifier.
dataword: 32-bit word value.


This defines the action taken by coprocessor z to supply a word of data during a store
word operation. The action is coprocessor-specific. The typical action would be to
supply the contents of the low-order word in coprocessor general register rt.


datadouble ← COP_SD (z, rt)


z: The coprocessor unit number.
rt: Coprocessor general register specifier.
datadouble: 64-bit doubleword value.


This defines the action taken by coprocessor z to supply a doubleword of data during 
a store doubleword operation. The action is coprocessor-specific. The typical action 
would be to supply the contents of the doubleword coprocessor general register rt. 
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A.2.2.2 Load and Store Memory Pseudocode Functions 


Regardless of byte-numbering order (endianness), the address of a halfword, word, or
doubleword is the smallest byte address among the bytes in the object. For a big-endian
ordering this is the most-significant byte; for a little-endian ordering this is the
least-significant byte.
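
As a brief illustration of this addressing rule, the following Python sketch lays out the same 32-bit word in both byte orders using the standard `struct` module. The value `0x11223344` is an arbitrary example; index 0 of each byte string stands for the smallest byte address, which is the address of the object.

```python
import struct

value = 0x11223344                  # arbitrary 32-bit example word

# Each bytes object lists the word's bytes in increasing address order.
big = struct.pack(">I", value)      # big-endian layout
little = struct.pack("<I", value)   # little-endian layout

# Index 0 is the smallest byte address, i.e. the address of the word.
print(hex(big[0]))      # 0x11, the most-significant byte
print(hex(little[0]))   # 0x44, the least-significant byte
```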


In the operation description pseudocode for load and store operations, the functions listed 
in Table A-3 are used to summarize the handling of virtual addresses and accessing 
physical memory. 


The size of the data item to be loaded or stored is passed in the AccessLength field. The
valid constant names and values are shown in Table A-4. The bytes within the addressed
unit of memory (quadword for 128-bit processors) which are used can be determined
directly from the AccessLength and the four low-order bits of the address.
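
The byte selection described above can be sketched in Python as follows. This is an illustrative helper, not part of the architecture definition: the name `bytes_accessed` is hypothetical, the access size is passed as a byte count rather than as an encoded AccessLength value, and the sketch assumes that an access never crosses a quadword boundary.

```python
def bytes_accessed(address, nbytes):
    """Byte offsets, within the aligned 16-byte quadword, used by an access.

    `nbytes` is the size of the access in bytes (1 to 16). The offsets are
    memory byte offsets; how they map onto register bytes depends on the
    endianness and is not modeled here.
    """
    start = address & 0xF            # four low-order bits of the address
    if not 1 <= nbytes <= 16 or start + nbytes > 16:
        raise ValueError("access does not fit in one quadword")
    return list(range(start, start + nbytes))


print(bytes_accessed(0x1006, 2))     # [6, 7]
print(bytes_accessed(0x1008, 8))     # [8, 9, 10, 11, 12, 13, 14, 15]
```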


Table A-3. Load and Store Functions 


(pAddr, CCA) ← AddressTranslation (vAddr, IorD, LorS)
pAddr: Physical Address.


CCA: Cache Coherence Algorithm: the method used to access caches and
memory and resolve the reference.


vAddr: Virtual Address.
IorD: Indicates whether access is for Instruction or Data.
LorS: Indicates whether access is for Load or Store.


Translate a virtual address to a physical address and a cache coherence algorithm describing the 
mechanism used to resolve the memory reference. 


Given the virtual address vAddr, and whether the reference is to Instructions or Data (IorD), find the
corresponding physical address (pAddr) and the cache coherence algorithm (CCA) used to resolve the 
reference. If the virtual address is in one of the unmapped address spaces the physical address and 
CCA are determined directly by the virtual address. If the virtual address is in one of the mapped 
address spaces then the TLB is used to determine the physical address and access type; if the 
required translation is not present in the TLB or the desired access is not permitted the function fails 
and an exception is taken. 


MemElem ← LoadMemory (CCA, AccessLength, pAddr, vAddr, IorD)


MemElem: Data is returned in a fixed width with a natural alignment. The width is the 
same size as the CPU general purpose register. 


CCA: Cache Coherence Algorithm: the method used to access caches and 
memory and resolve the reference. 


AccessLength: Length, in bytes, of access.

pAddr: Physical Address.

vAddr: Virtual Address.

IorD: Indicates whether access is for Instructions or Data.


Load a value from memory.


Uses the cache and main memory as specified in the Cache Coherence Algorithm (CCA) and the sort
of access (IorD) to find the contents of AccessLength memory bytes starting at physical location pAddr.
The data is returned in the fixed-width, naturally aligned memory element (MemElem). The low-order
two, three, or four bits of the address and the AccessLength indicate which of the bytes within
MemElem need to be given to the processor. If the memory access type of the reference is uncached,
then only the referenced bytes are read from memory and are valid within the memory element. If the
access type is cached and the data is not present in the cache, a block of memory of
implementation-specific size and alignment is read and loaded into the cache to satisfy the load
reference. At a minimum, the block is the entire memory element.
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StoreMemory (CCA, AccessLength, MemElem, pAddr, vAddr) 
CCA: Cache Coherence Algorithm: the method used to access caches and 
memory and resolve the reference. 
AccessLength: Length, in bytes, of access. 


MemElem: Data in the width and alignment of a memory element. The width is the 
same size as the CPU general purpose register. For a partial-memory- 
element store, only the bytes that will be stored must be valid. 


pAddr: Physical Address. 
vAddr: Virtual Address. 


Store a value to memory. 

The specified data is stored into the physical location pAddr using the memory hierarchy (data caches
and main memory) as specified by the Cache Coherence Algorithm (CCA). The MemElem contains
the data for an aligned, fixed-width memory element, though only the bytes that will actually be stored
to memory need to be valid. The low-order four bits of pAddr and the AccessLength field indicate
which of the bytes within the MemElem data should actually be stored; only these bytes in memory will
be changed.


Prefetch (CCA, pAddr, vAddr, DATA, hint) 
CCA: Cache Coherence Algorithm: the method used to access caches and 
memory and resolve the reference. 


pAddr: Physical Address. 

vAddr: Virtual Address. 

DATA: Indicates that access is for DATA. 

hint: Hint that indicates the possible use of the data.


Prefetch data from memory.

Prefetch is an advisory instruction for which an implementation-specific action is taken. The action
taken may increase performance but must not change the meaning of the program or alter 
architecturally-visible state. 


Table A-4. AccessLength Specifications for Loads / Stores


AccessLength name    Value    Meaning
QUADWORD             15       16 bytes (128 bits)
DOUBLEWORD           7        8 bytes (64 bits)
SEPTIBYTE            6        7 bytes (56 bits)
SEXTIBYTE            5        6 bytes (48 bits)
QUINTIBYTE           4        5 bytes (40 bits)
WORD                 3        4 bytes (32 bits)
TRIPLEBYTE           2        3 bytes (24 bits)
HALFWORD             1        2 bytes (16 bits)
BYTE                 0        1 byte (8 bits)
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A.2.2.3 Miscellaneous Functions 
Table A-5 describes additional miscellaneous functions for CPU instruction descriptions. 


Table A-5. Miscellaneous Functions 


SyncOperation (stype) 
stype: Type of synchronization operation to be performed. 


Based on the value of stype either a memory barrier operation is performed or a pipeline barrier 
operation is performed. 


In case of a memory barrier all pending loads and stores are retired. Loads are retired when the 
destination register is written. Stores are retired when the stored data (in store buffers or write buffers) is 
either stored in the data cache, or sent on the processor bus. 


All uncached accelerated data gathering operations are terminated.

The uncached accelerated buffer is invalidated. 

All bus read processes due to load/store/pref/cache instructions are completed. 
All pending bus write processes in the write back buffer are completed. 


In the case of a pipeline barrier, all instructions prior to the barrier are completed before the
instructions following the barrier operation are fetched. Note that the barrier operation does not wait
for any instruction that was issued prior to the barrier operation but has not retired (e.g., multiply,
divide, multicycle COP1 operations, or a pending load issued prior to the pipeline barrier operation).


SignalException (Exception) 


Exception: The exception condition that exists.
Signal an exception condition. 


This will result in an exception that aborts the instruction. The instruction operation pseudocode will 
never see a return from this function call. 


UndefinedResult() 
This function indicates that the result of the operation is undefined. 


NullifyCurrentInstruction()
Nullify the current instruction. 


This occurs during the instruction time for some instruction and that instruction is not executed further. 
This appears for branch-likely instructions during the execution of the instruction in the delay slot and it 
kills the instruction in the delay slot. 


CoprocessorOperation (z, cop_fun) 

z: Coprocessor unit number.

cop_fun: Coprocessor function from function field of instruction.


Perform the specified Coprocessor operation.
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A.3 CPU Instruction Formats 
A CPU instruction is a single 32-bit aligned word. There are three instruction formats:
Immediate (I-type), Jump (J-type), and Register (R-type). These formats are shown in
Figure A-1 below:


I-Type (Immediate) 


31      26 25    21 20    16 15                 0
    op        rs       rt         immediate
    6         5        5             16


6-bit primary operation code 
5-bit destination register specifier 


5-bit target (Source/destination) register specification or 
branch condition 


immediate 16-bit signed immediate used for: logical operands, arithmetic 
signed operands, load/store address byte offsets, PC-relative 
branch signed instruction displacement 


26-bit index shifted left two bits to supply the low-order 28 bits 
of the jump target address. 


5-bit shift amount 


5-bit source register specifier 


6-bit function field used to specify functions within the primary 
operation code value SPECIAL 


Figure A-1. CPU Instruction Formats 
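
The field boundaries described above can be sketched as a small Python decoder. This is an illustration only: the function names and the dictionary keys (`op`, `sa`, `function`, `index`) are labels chosen for this sketch, and the bit positions for `rd`, `sa`, and `function` follow the R-type layout used by the ADD encoding later in this appendix. Only the low-order 28 bits of a jump target are computed, as stated in the field description.

```python
def decode(word):
    """Split a 32-bit instruction word into every candidate field.

    Which fields are meaningful depends on the format (I, J or R type)
    selected by the primary operation code.
    """
    return {
        "op": (word >> 26) & 0x3F,        # bits 31..26
        "rs": (word >> 21) & 0x1F,        # bits 25..21
        "rt": (word >> 16) & 0x1F,        # bits 20..16
        "rd": (word >> 11) & 0x1F,        # bits 15..11 (R-type)
        "sa": (word >> 6) & 0x1F,         # bits 10..6  (R-type)
        "function": word & 0x3F,          # bits 5..0   (R-type)
        "immediate": word & 0xFFFF,       # bits 15..0  (I-type)
        "index": word & 0x3FFFFFF,        # bits 25..0  (J-type)
    }


def jump_target_low28(word):
    """Low-order 28 bits of the jump target: the 26-bit index shifted left by two."""
    return (word & 0x3FFFFFF) << 2


fields = decode(0x00221820)   # R-type word: op 000000, rs 1, rt 2, rd 3, function 100000
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], bin(fields["function"]))
```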
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A.4 Instruction Descriptions 


The user-level CPU instructions are described in alphabetical order in this section. 
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ADD Add Word ADD


31      26 25    21 20    16 15    11 10     6 5       0
 SPECIAL      rs       rt       rd       0       ADD
 000000                                00000   100000
    6         5        5        5        5        6

Format: ADD rd, rs, rt                                  MIPS I
Purpose: To add 32-bit integers. If overflow occurs, then trap. 
Description: rd ← rs + rt


The 32-bit word value in GPR rt is added to the 32-bit value in GPR rs to produce a 32-bit 
result. If the addition results in 32-bit 2’s complement arithmetic overflow then the 
destination register is not modified and an Integer Overflow exception occurs. If it does 
not overflow, the 32-bit result is placed into GPR rd. 


Restrictions: 


If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.


Operation: 
if (NotWordValue (GPR[rs] 63..0) or NotWordValue (GPR[rt] 63..0)) then UndefinedResult() endif
temp ← GPR[rs] 63..0 + GPR[rt] 63..0
if (32_bit_arithmetic_overflow) then
    SignalException (IntegerOverflow)
else
    GPR[rd] 63..0 ← sign_extend (temp 31..0)
endif


Exceptions: 


Integer Overflow 


Programming Notes: 


ADDU performs the same arithmetic operation but does not trap on overflow.
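
The Operation pseudocode for ADD can be modeled in Python as a rough sketch. Several choices here are assumptions made for illustration: registers are held as unsigned 64-bit integers in a list, the `IntegerOverflow` class stands in for SignalException (IntegerOverflow), and UndefinedResult() is modeled by raising `ValueError`, although the architecture only states that the result is undefined.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


class IntegerOverflow(Exception):
    """Stands in for SignalException (IntegerOverflow) in this sketch."""


def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of `value` to a 64-bit pattern."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value & MASK64


def not_word_value(reg):
    """True unless bits 63..31 of the 64-bit register value are all equal."""
    top = (reg & MASK64) >> 31           # bits 63..31 (33 bits)
    return top not in (0, (1 << 33) - 1)


def add(gpr, rd, rs, rt):
    if not_word_value(gpr[rs]) or not_word_value(gpr[rt]):
        # Assumption: the undefined result is modeled as an error.
        raise ValueError("operand is not a sign-extended 32-bit value")
    a = gpr[rs] & MASK32
    b = gpr[rt] & MASK32
    temp = (a + b) & MASK32
    # 32-bit two's complement overflow: the operands have the same sign
    # and the sign of the result differs from it.
    if (a >> 31) == (b >> 31) and (temp >> 31) != (a >> 31):
        raise IntegerOverflow()          # gpr[rd] is left unmodified
    gpr[rd] = sign_extend(temp, 32)


gpr = [0] * 32
gpr[1] = 5
gpr[2] = sign_extend(-3, 32)             # -3 as a sign-extended word
add(gpr, 3, 1, 2)
print(gpr[3])                            # 2
```

Because the exception is raised before the assignment, the destination register keeps its previous value on overflow, which matches the Description text above.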
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ADDI Add Immediate Word ADDI 


31      26 25    21 20    16 15                 0
   ADDI       rs       rt         immediate
    6         5        5             16

Format: ADDI rt, rs, immediate                          MIPS I
Purpose: To add a constant to a 32-bit integer. If overflow occurs, then trap. 
Description: rt ← rs + immediate


The 16-bit signed immediate is added to the 32-bit value in GPR rs to produce a 32-bit 
result. If the addition results in 32-bit 2’s complement arithmetic overflow then the 
destination register is not modified and an Integer Overflow exception occurs. If it does 
not overflow, the 32-bit result is placed into GPR rt. 


Restrictions: 


If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result 
of the operation is undefined. 


Operation: 


if (NotWordValue (GPR[rs] 63..0)) then UndefinedResult() endif
temp ← GPR[rs] 63..0 + sign_extend (immediate)
if (32_bit_arithmetic_overflow) then
    SignalException (IntegerOverflow)
else
    GPR[rt] 63..0 ← sign_extend (temp 31..0)
endif


Exceptions: 


Integer Overflow 


Programming Notes: 


ADDIU performs the same arithmetic operation but does not trap on overflow.
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ADDIU Add Immediate Unsigned Word ADDIU 


31      26 25    21 20    16 15                 0
   ADDIU      rs       rt         immediate
    6         5        5             16

Format: ADDIU rt, rs, immediate                         MIPS I
Purpose: To add a constant to a 32-bit integer. 
Description: rt ← rs + immediate


The 16-bit signed immediate is added to the 32-bit value in GPR rs and the 32-bit 
arithmetic result is placed into GPR rt. 
No Integer Overflow exception occurs under any circumstances.

Restrictions: 
If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result 
of the operation is undefined. 


Operation: 


if (NotWordValue (GPR[rs] 63..0)) then UndefinedResult() endif
temp ← GPR[rs] 63..0 + sign_extend (immediate)
GPR[rt] 63..0 ← sign_extend (temp 31..0)

Exceptions: 


None 


Programming Notes: 


The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit 
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is 
not signed, such as address arithmetic, or integer arithmetic environments that ignore 
overflow, such as C language arithmetic. 
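
The modulo behavior described in this note can be seen in a short Python sketch. The helper names are hypothetical and the NotWordValue check is omitted for brevity. The example also shows that the immediate is sign-extended even though the instruction name says "unsigned".

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of `value` to a 64-bit pattern."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value & MASK64


def addiu(rs_value, immediate):
    """Model of ADDIU: 32-bit modulo addition, result sign-extended to 64 bits."""
    temp = (rs_value + sign_extend(immediate, 16)) & MASK32
    return sign_extend(temp, 32)


# Wraps from the largest positive word to the most negative one, no trap.
print(hex(addiu(0x7FFFFFFF, 1)))     # 0xffffffff80000000
# The immediate 0xFFFC is sign-extended, so it subtracts 4.
print(hex(addiu(0x1000, 0xFFFC)))    # 0xffc
```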
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ADDU Add Unsigned Word ADDU


31      26 25    21 20    16 15    11 10     6 5       0
 SPECIAL      rs       rt       rd       0      ADDU
 000000                                00000   100001
    6         5        5        5        5        6

Format: ADDU rd, rs, rt                                 MIPS I
Purpose: To add 32-bit integers. 
Description: rd ← rs + rt


The 32-bit word value in GPR rt is added to the 32-bit value in GPR rs and the 32-bit 
arithmetic result is placed into GPR rd. 
No Integer Overflow exception occurs under any circumstances. 

Restrictions: 
If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.


Operation: 
if (NotWordValue (GPR[rs]63..0) or NotWordValue (GPR[rt]63..0)) then UndefinedResult() endif
temp ← GPR[rs]63..0 + GPR[rt]63..0
GPR[rd]63..0 ← sign_extend (temp31..0)

Exceptions: 


None 


Programming Notes: 


The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit 
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is 
not signed, such as address arithmetic, or integer arithmetic environments that ignore 
overflow, such as C language arithmetic. 




AND And AND

31    26 25    21 20    16 15    11 10     6 5       0
 SPECIAL     rs       rt       rd       0        AND
 000000                               00000    100100
    6         5        5        5       5         6
MIPS I
Format: AND rd, rs, rt
Purpose: To do a bitwise logical AND.
Description: rd ← rs AND rt


The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical AND
operation. The result is placed into GPR rd.


Restrictions: 

None 
Operation: 

GPR[rd]63..0 ← GPR[rs]63..0 and GPR[rt]63..0
Exceptions: 

None 


Programming Notes: 


None 
ANDI And Immediate ANDI

31    26 25    21 20    16 15                  0
  ANDI       rs       rt        immediate
    6         5        5           16
MIPS I
Format: ANDI rt, rs, immediate
Purpose: To do a bitwise logical AND with a constant.
Description: rt ← rs AND immediate


The 16-bit immediate is zero-extended to the left and combined with the contents of GPR
rs in a bitwise logical AND operation. The result is placed into GPR rt.


Restrictions: 
None 


Operation: 
GPR[rt]63..0 ← zero_extend (immediate) and GPR[rs]63..0
Exceptions: 


None 


Programming Notes: 


None 
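
Because the ANDI immediate is zero-extended rather than sign-extended, the upper 48 bits of the result are always zero. A minimal C sketch of this, with the hypothetical name `andi_emulate`:

```c
#include <stdint.h>

/* Hypothetical helper: emulates ANDI. The 16-bit immediate is
 * zero-extended, so bits 63..16 of the result are always zero. */
uint64_t andi_emulate(uint64_t rs, uint16_t immediate)
{
    return rs & (uint64_t)immediate;
}
```

Even an immediate with bit 15 set, such as 0x8000, masks only bit 15 of the register; it does not propagate ones into the upper bits.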
BEQ Branch on Equal BEQ

31    26 25    21 20    16 15                  0
   BEQ       rs       rt          offset
    6         5        5           16
MIPS I
Format: BEQ rs, rt, offset
Purpose: To compare GPRs then do a PC-relative conditional branch.
Description: if (rs = rt) then branch


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs and GPR rt are equal, branch to the effective target address after
the instruction in the delay slot is executed. 


Restrictions:


None 


Operation: 


I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← (GPR[rs]63..0 = GPR[rt]63..0)
I+1: if condition then
         PC ← PC + tgt_offset
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
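
The target address calculation shared by the conditional branches can be sketched in C. The function name `branch_target` is hypothetical, and the sketch models only the address arithmetic in the Operation section, not the pipeline.

```c
#include <stdint.h>

/* Hypothetical helper: computes the effective target of a conditional
 * branch located at branch_pc, given its 16-bit offset field. */
uint64_t branch_target(uint64_t branch_pc, int16_t offset)
{
    /* sign_extend (offset || 0^2): an 18-bit signed byte offset. */
    int64_t tgt_offset = (int64_t)offset * 4;

    /* The base is the address of the instruction in the delay slot. */
    return branch_pc + 4 + (uint64_t)tgt_offset;
}
```

With this arithmetic the reachable targets lie between -131072 and +131068 bytes from the delay-slot instruction, and an offset field of -1 branches back to the branch itself.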
BEQL Branch on Equal Likely BEQL

31    26 25    21 20    16 15                  0
  BEQL       rs       rt          offset
    6         5        5           16
MIPS II
Format: BEQL rs, rt, offset
Purpose: To compare GPRs then do a PC-relative conditional branch; execute the delay slot only if
the branch is taken.
Description: if (rs = rt) then branch_likely


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs and GPR rt are equal, branch to the target address after the 
instruction in the delay slot is executed. If the branch is not taken, the instruction in the 
delay slot is not executed. 


Restrictions: 


None 


Operation: 
I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← (GPR[rs]63..0 = GPR[rt]63..0)
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
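
The difference between a branch-likely and an ordinary branch is the treatment of the delay slot when the branch is not taken. The following C sketch models one BEQL step; the type `likely_step` and the function `beql_step` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of one branch-likely step. */
typedef struct {
    bool     run_delay_slot;  /* false: the delay-slot instruction is nullified */
    uint64_t next_pc;         /* address executed after (or instead of) the delay slot */
} likely_step;

likely_step beql_step(uint64_t branch_pc, int64_t rs, int64_t rt, int16_t offset)
{
    likely_step s;
    if (rs == rt) {
        s.run_delay_slot = true;
        s.next_pc = branch_pc + 4 + (uint64_t)((int64_t)offset * 4);
    } else {
        s.run_delay_slot = false;     /* NullifyCurrentInstruction() */
        s.next_pc = branch_pc + 8;    /* continue past the delay slot */
    }
    return s;
}
```

For an ordinary BEQ the delay-slot instruction would execute in both cases; only the `next_pc` selection would remain conditional.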
BGEZ Branch on Greater Than or Equal to Zero BGEZ

31    26 25    21 20    16 15                  0
 REGIMM      rs      BGEZ         offset
 000001              00001
    6         5        5           16
MIPS I
Format: BGEZ rs, offset
Purpose: To test a GPR then do a PC-relative conditional branch.
Description: if (rs ≥ 0) then branch


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the
effective target address after the instruction in the delay slot is executed. 


Restrictions: 


None 


Operation: 


I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 ≥ 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BGEZAL Branch on Greater Than or Equal to Zero and Link BGEZAL

31    26 25    21 20    16 15                  0
 REGIMM      rs     BGEZAL        offset
 000001              10001
    6         5        5           16
MIPS I
Format: BGEZAL rs, offset
Purpose: To test a GPR then do a PC-relative conditional procedure call.
Description: if (rs ≥ 0) then procedure_call


Place the return address link in GPR 31. The return link is the address of the second 
instruction following the branch, where execution would continue after a procedure call. 


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the
effective target address after the instruction in the delay slot is executed. 


Restrictions:


GPR 31 must not be used for the source register rs, because such an instruction does not
have the same effect when re-executed. The result of executing such an instruction is
undefined. This restriction permits an exception handler to resume execution by
re-executing the branch when an exception occurs in the branch delay slot.


Operation: 

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 ≥ 0^GPRLEN
     GPR[31]63..0 ← zero_extend (PC + 8)
I+1: if condition then
         PC ← PC + tgt_offset
     endif


Exceptions: 
None 
Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to
more distant addresses.
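
The link behavior can be sketched in C. Following the Operation pseudocode, GPR 31 is written in step I before the condition is used, so in this sketch the return address is stored whether or not the branch is taken. The names `link_step` and `bgezal_step` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result of one branch-and-link step. */
typedef struct {
    uint64_t ra;       /* value written to GPR 31 */
    uint64_t next_pc;  /* address executed after the delay slot */
} link_step;

link_step bgezal_step(uint64_t branch_pc, int64_t rs, int16_t offset)
{
    link_step s;

    /* Return link: the second instruction after the branch (PC + 8). */
    s.ra = branch_pc + 8;

    if (rs >= 0) {
        s.next_pc = branch_pc + 4 + (uint64_t)((int64_t)offset * 4);
    } else {
        s.next_pc = branch_pc + 8;
    }
    return s;
}
```

Because the link register is overwritten by the instruction itself, using GPR 31 as rs would make a re-executed branch test a different value, which is the reason for the restriction above.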
BGEZALL Branch on Greater Than or Equal to Zero and Link Likely BGEZALL

31    26 25    21 20    16 15                  0
 REGIMM      rs     BGEZALL       offset
 000001              10011
    6         5        5           16
MIPS II
Format: BGEZALL rs, offset
Purpose: To test a GPR then do a PC-relative conditional procedure call; execute the delay slot only
if the branch is taken.
Description: if (rs ≥ 0) then procedure_call_likely


Place the return address link in GPR 31. The return link is the address of the second 
instruction following the branch, where execution would continue after a procedure call. 


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the
effective target address after the instruction in the delay slot is executed. If the branch is 
not taken, the instruction in the delay slot is not executed. 


Restrictions: 


GPR 31 must not be used for the source register rs, because such an instruction does not
have the same effect when re-executed. The result of executing such an instruction is
undefined. This restriction permits an exception handler to resume execution by
re-executing the branch when an exception occurs in the branch delay slot.


Operation: 

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 ≥ 0^GPRLEN
     GPR[31]63..0 ← zero_extend (PC + 8)
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif
Exceptions: 


None 
Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to
more distant addresses.
BGEZL Branch on Greater Than or Equal to Zero Likely BGEZL

31    26 25    21 20    16 15                  0
 REGIMM      rs      BGEZL        offset
 000001              00011
    6         5        5           16
MIPS II
Format: BGEZL rs, offset
Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the
branch is taken.
Description: if (rs ≥ 0) then branch_likely


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the
effective target address after the instruction in the delay slot is executed. If the branch is 
not taken, the instruction in the delay slot is not executed. 


Restrictions: 


None 


Operation: 
I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 ≥ 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BGTZ Branch on Greater Than Zero BGTZ

31    26 25    21 20    16 15                  0
  BGTZ       rs        0          offset
 000111              00000
    6         5        5           16
MIPS I
Format: BGTZ rs, offset
Purpose: To test a GPR then do a PC-relative conditional branch.
Description: if (rs > 0) then branch


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are greater than zero (sign bit is 0 but value not zero), branch to 
the effective target address after the instruction in the delay slot is executed. 


Restrictions: 


None 


Operation: 


I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 > 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BGTZL Branch on Greater Than Zero Likely BGTZL

31    26 25    21 20    16 15                  0
  BGTZL      rs        0          offset
    6         5        5           16
MIPS II
Format: BGTZL rs, offset
Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the
branch is taken.
Description: if (rs > 0) then branch_likely


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are greater than zero (sign bit is 0 but value not zero), branch to 
the effective target address after the instruction in the delay slot is executed. If the branch 
is not taken, the instruction in the delay slot is not executed. 


Restrictions: 


None 


Operation:
I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 > 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif
Exceptions: 


None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BLEZ Branch on Less Than or Equal to Zero BLEZ

31    26 25    21 20    16 15                  0
  BLEZ       rs        0          offset
    6         5        5           16
MIPS I
Format: BLEZ rs, offset
Purpose: To test a GPR then do a PC-relative conditional branch.
Description: if (rs ≤ 0) then branch


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are less than or equal to zero (sign bit is 1 or value is zero),
branch to the effective target address after the instruction in the delay slot is executed. 


Restrictions: 


None 


Operation: 


I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 ≤ 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BLEZL Branch on Less Than or Equal to Zero Likely BLEZL

31    26 25    21 20    16 15                  0
  BLEZL      rs        0          offset
    6         5        5           16
MIPS II
Format: BLEZL rs, offset
Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the
branch is taken.
Description: if (rs ≤ 0) then branch_likely


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are less than or equal to zero (sign bit is 1 or value is zero),
branch to the effective target address after the instruction in the delay slot is executed. If 
the branch is not taken, the instruction in the delay slot is not executed. 


Restrictions: 


None 


Operation: 
I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 ≤ 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif
Exceptions: 


None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BLTZ Branch on Less Than Zero BLTZ

31    26 25    21 20    16 15                  0
 REGIMM      rs      BLTZ         offset
    6         5        5           16
MIPS I
Format: BLTZ rs, offset
Purpose: To test a GPR then do a PC-relative conditional branch.
Description: if (rs < 0) then branch


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target 
address after the instruction in the delay slot is executed. 


Restrictions: 


None 


Operation: 


I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 < 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BLTZAL Branch on Less Than Zero and Link BLTZAL

31    26 25    21 20    16 15                  0
 REGIMM      rs     BLTZAL        offset
 000001              10000
    6         5        5           16
MIPS I
Format: BLTZAL rs, offset
Purpose: To test a GPR then do a PC-relative conditional procedure call.
Description: if (rs < 0) then procedure_call


Place the return address link in GPR 31. The return link is the address of the second 
instruction following the branch (not the branch itself), where execution would continue 
after a procedure call. 


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch, in the branch delay slot, to form a PC-relative 
effective target address. 


If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target 
address after the instruction in the delay slot is executed. 


Restrictions: 


GPR 31 must not be used for the source register rs, because such an instruction does not
have the same effect when re-executed. The result of executing such an instruction is
undefined. This restriction permits an exception handler to resume execution by
re-executing the branch when an exception occurs in the branch delay slot.

Operation: 

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 < 0^GPRLEN
     GPR[31]63..0 ← zero_extend (PC + 8)
I+1: if condition then
         PC ← PC + tgt_offset
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to
more distant addresses.
BLTZALL Branch on Less Than Zero and Link Likely BLTZALL

31    26 25    21 20    16 15                  0
 REGIMM      rs     BLTZALL       offset
 000001              10010
    6         5        5           16
MIPS II
Format: BLTZALL rs, offset
Purpose: To test a GPR then do a PC-relative conditional procedure call; execute the delay slot only
if the branch is taken.
Description: if (rs < 0) then procedure_call_likely


Place the return address link in GPR 31. The return link is the address of the second 
instruction following the branch (not the branch itself), where execution would continue 
after a procedure call. 


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch, in the branch delay slot, to form a PC-relative 
effective target address. 


If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target 
address after the instruction in the delay slot is executed. If the branch is not taken, the 
instruction in the delay slot is not executed. 


Restrictions: 


GPR 31 must not be used for the source register rs, because such an instruction does not
have the same effect when re-executed. The result of executing such an instruction is
undefined. This restriction permits an exception handler to resume execution by
re-executing the branch when an exception occurs in the branch delay slot.


Operation: 

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 < 0^GPRLEN
     GPR[31]63..0 ← zero_extend (PC + 8)
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif
Exceptions: 


None 
Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use jump
and link (JAL) or jump and link register (JALR) instructions for procedure calls to more
distant addresses.
BLTZL Branch on Less Than Zero Likely BLTZL

31    26 25    21 20    16 15                  0
 REGIMM      rs      BLTZL        offset
 000001              00010
    6         5        5           16
MIPS II
Format: BLTZL rs, offset
Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the
branch is taken.
Description: if (rs < 0) then branch_likely


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target 
address after the instruction in the delay slot is executed. If the branch is not taken, the 
instruction in the delay slot is not executed. 


Restrictions: 


None 


Operation: 


I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]63..0 < 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BNE Branch on Not Equal BNE

31    26 25    21 20    16 15                  0
   BNE       rs       rt          offset
    6         5        5           16
MIPS I
Format: BNE rs, rt, offset
Purpose: To compare GPRs then do a PC-relative conditional branch.
Description: if (rs ≠ rt) then branch


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs and GPR rt are not equal, branch to the effective target address 
after the instruction in the delay slot is executed. 


Restrictions: 


None 


Operation: 


I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← (GPR[rs]63..0 ≠ GPR[rt]63..0)
I+1: if condition then
         PC ← PC + tgt_offset
     endif


Exceptions: 
None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BNEL Branch on Not Equal Likely BNEL

31    26 25    21 20    16 15                  0
  BNEL       rs       rt          offset
    6         5        5           16
MIPS II
Format: BNEL rs, rt, offset
Purpose: To compare GPRs then do a PC-relative conditional branch; execute the delay slot only if
the branch is taken.
Description: if (rs ≠ rt) then branch_likely


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the contents of GPR rs and GPR rt are not equal, branch to the effective target address 
after the instruction in the delay slot is executed. If the branch is not taken, the 
instruction in the delay slot is not executed. 


Restrictions: 


None 


Operation: 
I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← (GPR[rs]63..0 ≠ GPR[rt]63..0)
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif
Exceptions: 


None 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses.
BREAK Breakpoint BREAK

31    26 25                          6 5       0
 SPECIAL             code               BREAK
    6                 20                  6
MIPS I
Format: BREAK
Purpose: To cause a Breakpoint exception.
Description:


A breakpoint exception occurs, immediately and unconditionally transferring control to 
the exception handler. 


The code field is available for use as software parameters, but is retrieved by the exception 
handler only by loading the contents of the memory word containing the instruction. 


Restrictions: 
None 


Operation: 
SignalException (Breakpoint)
Exceptions: 


Breakpoint 


Programming Notes: 


None 
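
Because the code field is available to the handler only through the instruction word itself, a handler that wants the code must load that word and extract bits 25..6. A minimal C sketch of the extraction follows; the function name `break_code` is hypothetical, and locating the instruction word (for example, when BREAK sits in a branch delay slot) is outside the scope of this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: extracts the 20-bit code field (bits 25..6) from a
 * BREAK instruction word that the handler has loaded from memory. */
uint32_t break_code(uint32_t instruction_word)
{
    return (instruction_word >> 6) & 0xFFFFFu;
}
```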
DADD Doubleword Add DADD

31    26 25    21 20    16 15    11 10     6 5       0
 SPECIAL     rs       rt       rd       0       DADD
 000000                               00000    101100
    6         5        5        5       5         6
MIPS III
Format: DADD rd, rs, rt
Purpose: To add 64-bit integers. If overflow occurs, then trap.
Description: rd ← rs + rt


The 64-bit doubleword value in GPR rt is added to the 64-bit value in GPR rs to produce a 
64-bit result. If the addition results in 64-bit 2’s complement arithmetic overflow then the 
destination register is not modified and an Integer Overflow exception occurs. If it does 
not overflow, the 64-bit result is placed into GPR rd. 


Restrictions: 


None 


Operation: 


temp ← GPR[rs]63..0 + GPR[rt]63..0
if (64_bit_arithmetic_overflow) then
    SignalException (IntegerOverflow)
else
    GPR[rd]63..0 ← temp
endif


Exceptions: 


Integer Overflow 


Programming Notes: 


DADDU performs the same arithmetic operation but does not trap on overflow.
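
The 64-bit overflow condition can be sketched in C without relying on signed wraparound. This is one common way to detect 2's complement overflow, not necessarily how the hardware does it; the name `dadd_emulate` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: emulates DADD. Returns false (the Integer Overflow
 * case) and leaves *rd unmodified when the signed 64-bit sum overflows. */
bool dadd_emulate(int64_t rs, int64_t rt, int64_t *rd)
{
    /* Add as unsigned so that wraparound is well defined in C. */
    uint64_t sum = (uint64_t)rs + (uint64_t)rt;

    /* Overflow occurs when both operands have the same sign and the sign
     * of the sum differs from it. */
    uint64_t overflow = ~((uint64_t)rs ^ (uint64_t)rt) & ((uint64_t)rs ^ sum);
    if (overflow >> 63) {
        return false;           /* SignalException (IntegerOverflow) */
    }
    *rd = rs + rt;              /* safe: known not to overflow */
    return true;
}
```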
DADDI Doubleword Add Immediate DADDI

31    26 25    21 20    16 15                  0
  DADDI      rs       rt        immediate
    6         5        5           16
MIPS III
Format: DADDI rt, rs, immediate
Purpose: To add a constant to a 64-bit integer. If overflow occurs, then trap.
Description: rt ← rs + immediate


The 16-bit signed immediate is added to the 64-bit value in GPR rs to produce a 64-bit 
result. If the addition results in 64-bit 2’s complement arithmetic overflow then the 
destination register is not modified and an Integer Overflow exception occurs. If it does 
not overflow, the 64-bit result is placed into GPR rt. 


Restrictions: 


None 


Operation: 


temp ← GPR[rs]63..0 + sign_extend (immediate)
if (64_bit_arithmetic_overflow) then
    SignalException (IntegerOverflow)
else
    GPR[rt]63..0 ← temp
endif


Exceptions: 


Integer Overflow 


Programming Notes: 


DADDIU performs the same arithmetic operation but does not trap on overflow.
DADDIU Doubleword Add Immediate Unsigned DADDIU

31    26 25    21 20    16 15                  0
 DADDIU      rs       rt        immediate
    6         5        5           16
MIPS III
Format: DADDIU rt, rs, immediate
Purpose: To add a constant to a 64-bit integer.
Description: rt ← rs + immediate


The 16-bit signed immediate is added to the 64-bit value in GPR rs and the 64-bit 
arithmetic result is placed into GPR rt. 


No Integer Overflow exception occurs under any circumstances.
Restrictions: 
None 
Operation: 
GPR[rt]63..0 ← GPR[rs]63..0 + sign_extend (immediate)
Exceptions: 
None 
Programming Notes: 
The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is
not signed, such as address arithmetic, or integer arithmetic environments that ignore
overflow, such as C language arithmetic.




DADDU Doubleword Add Unsigned DADDU

31    26 25    21 20    16 15    11 10     6 5       0
 SPECIAL     rs       rt       rd       0       DADDU
 000000                               00000    101101
    6         5        5        5       5         6
MIPS III
Format: DADDU rd, rs, rt
Purpose: To add 64-bit integers.
Description: rd ← rs + rt


The 64-bit doubleword value in GPR rt is added to the 64-bit value in GPR rs and the
64-bit arithmetic result is placed into GPR rd.


No Integer Overflow exception occurs under any circumstances.
Restrictions: 
None 
Operation: 
GPR[rd]63..0 ← GPR[rs]63..0 + GPR[rt]63..0
Exceptions:
None 
Programming Notes: 
The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is
not signed, such as address arithmetic, or integer arithmetic environments that ignore
overflow, such as C language arithmetic.
DIV Divide Word DIV

31    26 25    21 20    16 15             6 5       0
 SPECIAL     rs       rt           0            DIV
 000000                      00 0000 0000     011010
    6         5        5          10             6
MIPS I
Format: DIV rs, rt
Purpose: To divide 32-bit signed integers.
Description: (LO, HI) ← rs / rt


The 32-bit word value in GPR rs is divided by the 32-bit value in GPR rt, treating both
operands as signed values. The 32-bit quotient is placed into special register LO and the
32-bit remainder is placed into special register HI.
No arithmetic exception occurs under any circumstances.

Restrictions:
If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.
If the divisor in GPR rt is zero, the arithmetic result value is undefined.


Operation: 


if (NotWordValue (GPR[rs]) or NotWordValue (GPR[rt])) then UndefinedResult() endif
q ← GPR[rs]31..0 div GPR[rt]31..0
LO63..0 ← sign_extend (q31..0)
r ← GPR[rs]31..0 mod GPR[rt]31..0
HI63..0 ← sign_extend (r31..0)
Exceptions: 


None 


Supplementary Explanation: 


Normally, when 0x80000000 (-2147483648), the signed minimum value, is divided by
0xFFFFFFFF (-1), the operation results in an overflow. However, in this instruction an
overflow exception does not occur, and the result is as follows:


Quotient is 0x80000000 (-2147483648), and remainder is 0x00000000 (0).


The signs of the quotient and the remainder are based on the signs of the dividend and the
divisor as shown in the table below:
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Dividend Remainder 


Programming Notes: 


In the C790, the integer divide operation proceeds asynchronously and allows other CPU 
instructions to execute before it is retired. An attempt to read LO or HI before the results
are written will wait (interlock) until the results are ready. Asynchronous execution does 
not affect the program result, but offers an opportunity for performance improvement by 
scheduling the divide so that other instructions can execute in parallel. 


No arithmetic exception occurs under any circumstances. If divide-by-zero or overflow 
conditions should be detected and some action taken, then the divide instruction is 
typically followed by additional instructions to check for a zero divisor and / or for overflow. 
If the divide is asynchronous then the zero-divisor check can execute in parallel with the 
divide. The action taken on either divide-by-zero or overflow is either a convention within 
the program itself or more typically, the system software; one possibility is to take a 
BREAK exception with a code field value to signal the problem to the system software. 


As an example, the C programming language in a UNIX environment expects division by 
zero to either terminate the program or execute a program-specified signal handler. C 
does not expect overflow to cause any exceptional condition. If the C compiler uses a divide 
instruction, it also emits code to test for a zero divisor and execute a BREAK instruction to 
inform the operating system if one is detected. 


In the C790, sign-extended 32-bit values (bits 63..31) are ignored on divide operation. 
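
The zero-divisor check that a compiler emits around DIV, together with the overflow case from the Supplementary Explanation, can be sketched in C. The names `div_result` and `div_checked` are hypothetical. The sketch also assumes that the quotient truncates toward zero and that the remainder takes the sign of the dividend, as C99 `/` and `%` do; that rounding rule is an assumption made here, not something this text states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the LO/HI pair written by DIV. */
typedef struct {
    int32_t lo;   /* quotient  */
    int32_t hi;   /* remainder */
} div_result;

/* Returns false for a zero divisor, the case in which compiled code would
 * typically execute BREAK; *out is left untouched in that case. */
bool div_checked(int32_t rs, int32_t rt, div_result *out)
{
    if (rt == 0) {
        return false;
    }
    if (rs == INT32_MIN && rt == -1) {
        /* Undefined in C; the manual specifies this result for DIV. */
        out->lo = INT32_MIN;
        out->hi = 0;
        return true;
    }
    out->lo = rs / rt;   /* assumption: truncation toward zero */
    out->hi = rs % rt;   /* assumption: sign follows the dividend */
    return true;
}
```

On the processor the zero test can be scheduled while the asynchronous divide is in flight; the sequential check here only models the outcome.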
DIVU Divide Unsigned Word DIVU

31    26 25    21 20    16 15             6 5       0
 SPECIAL     rs       rt           0            DIVU
 000000                      00 0000 0000     011011
    6         5        5          10             6
MIPS I
Format: DIVU rs, rt
Purpose: To divide 32-bit unsigned integers.
Description: (LO, HI) ← rs / rt


The 32-bit word value in GPR rs is divided by the 32-bit value in GPR rt, treating both
operands as unsigned values. The 32-bit quotient is placed into special register LO and
the 32-bit remainder is placed into special register HI.
No arithmetic exception occurs under any circumstances.

Restrictions:
If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.
If the divisor in GPR rt is zero, the arithmetic result is undefined.


Operation: 


if (NotWordValue (GPR[rs]) or NotWordValue (GPR[rt])) then UndefinedResult() endif
q ← (0 || GPR[rs]31..0) div (0 || GPR[rt]31..0)
LO63..0 ← sign_extend (q31..0)
r ← (0 || GPR[rs]31..0) mod (0 || GPR[rt]31..0)
HI63..0 ← sign_extend (r31..0)
Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the DIV instruction. 
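
One detail of the DIVU Operation is easy to miss: the operands are treated as unsigned, but the 32-bit quotient and remainder are still sign-extended when written to LO and HI. A minimal C sketch, with the hypothetical names `divu_result`, `sign_extend32`, and `divu_checked`:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the 64-bit LO/HI pair written by DIVU. */
typedef struct {
    int64_t lo;   /* sign-extended 32-bit quotient  */
    int64_t hi;   /* sign-extended 32-bit remainder */
} divu_result;

/* Sign-extends bits 31..0 to 64 bits without narrowing conversions. */
static int64_t sign_extend32(uint32_t x)
{
    return (int64_t)x - ((x & 0x80000000u) ? ((int64_t)1 << 32) : 0);
}

/* Returns false for a zero divisor, whose result the manual leaves
 * undefined; *out is left untouched in that case. */
bool divu_checked(uint32_t rs, uint32_t rt, divu_result *out)
{
    if (rt == 0) {
        return false;
    }
    out->lo = sign_extend32(rs / rt);   /* unsigned divide */
    out->hi = sign_extend32(rs % rt);
    return true;
}
```

Dividing 0xFFFFFFFF by 1 therefore leaves a 64-bit LO value with all bits set, even though the quotient is a large positive number when read as an unsigned word.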
DSLL Doubleword Shift Left Logical DSLL

31    26 25    21 20    16 15    11 10     6 5       0
 SPECIAL      0       rt       rd      sa       DSLL
 000000     00000                              111000
    6         5        5        5       5         6
MIPS III
Format: DSLL rd, rt, sa
Purpose: To left shift a doubleword by a fixed amount (0 to 31 bits).
Description: rd ← rt << sa


The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied
bits; the result is placed in GPR rd. The bit shift count in the range 0 to 31 is specified by
sa.


Restrictions: 
None 


Operation: 


s ← 0 || sa
GPR[rd]63..0 ← GPR[rt](63-s)..0 || 0^s


Exceptions: 


None 


Programming Notes: 


None
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DSLL32 Doubleword Shift Left Logical Plus 32 DSLL32


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     0       rt       rd       sa     DSLL32
000000    00000                              111100
   6        5        5        5        5        6
MIPS III
Format: DSLL32 rd, rt, sa
Purpose: To left shift a doubleword by a fixed amount: 32 to 63 bits.
Description: rd ← rt << (sa + 32)


The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied 
bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63 is specified by 
sa +32. 


Restrictions: 
None 


Operation: 


s ← 1 || sa    /* 32 + sa */
GPR[rd]63..0 ← GPR[rt](63-s)..0 || 0^s
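
The relationship between the 5-bit sa field and the 6-bit shift count can be sketched in C as follows. This is an illustrative model only; the function names dsll_model and dsll32_model are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* DSLL: the shift count is 0 || sa, i.e. the 5-bit sa field (0..31). */
static uint64_t dsll_model(uint64_t rt, unsigned sa)
{
    assert(sa < 32);
    return rt << sa;
}

/* DSLL32: the shift count is 1 || sa, i.e. sa + 32 (32..63). */
static uint64_t dsll32_model(uint64_t rt, unsigned sa)
{
    assert(sa < 32);
    return rt << (sa | 32u);   /* setting bit 5 adds 32 to the count */
}
```

Together the two instructions cover every fixed shift count from 0 to 63 while keeping the sa field five bits wide.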


Exceptions: 


None 


Programming Notes: 


None
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DSLLV Doubleword Shift Left Logical Variable DSLLV


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     rs       rt       rd       0      DSLLV
000000                               00000   010100
   6        5        5        5        5        6
MIPS III
Format: DSLLV rd, rt, rs
Purpose: To left shift a doubleword by a variable number of bits.
Description: rd ← rt << rs


The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied 
bits; the result is placed in GPR rd. The bit shift count in the range 0 to 63 is specified by 
the low-order six bits in GPR rs. 


Restrictions: 
None 


Operation: 


s ← 0 || GPR[rs]5..0
GPR[rd]63..0 ← GPR[rt](63-s)..0 || 0^s


Exceptions: 


None 


Programming Notes: 


None
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DSRA Doubleword Shift Right Arithmetic DSRA


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     0       rt       rd       sa      DSRA
000000    00000                              111011
   6        5        5        5        5        6
MIPS III
Format: DSRA rd, rt, sa
Purpose: To arithmetic right shift a doubleword by a fixed amount: 0 to 31 bits.
Description: rd ← rt >> sa (arithmetic)


The 64-bit doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) 
into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 
31 is specified by sa. 


Restrictions: 
None 


Operation: 


s ← 0 || sa
GPR[rd]63..0 ← (GPR[rt]63)^s || GPR[rt]63..s
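
The sign-bit replication in the operation above can be sketched in portable C as follows. The function name dsra_model is hypothetical; the count s stands for the full 6-bit shift amount, so the same model also covers DSRA32 (s = sa + 32) and DSRAV.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift of a 64-bit value by s bits (0..63). */
static uint64_t dsra_model(uint64_t rt, unsigned s)
{
    uint64_t shifted;

    assert(s < 64);
    shifted = rt >> s;                     /* logical shift: zeros enter at the top */
    if ((rt >> 63) != 0 && s != 0) {
        shifted |= ~UINT64_C(0) << (64 - s); /* replace the top s bits with the sign bit */
    }
    return shifted;
}
```

The explicit fill is used because right-shifting a negative signed value is implementation-defined in C.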


Exceptions: 


None 


Programming Notes: 


None
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DSRA32 Doubleword Shift Right Arithmetic Plus 32 DSRA32


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     0       rt       rd       sa     DSRA32
000000    00000                              111111
   6        5        5        5        5        6
MIPS III
Format: DSRA32 rd, rt, sa
Purpose: To arithmetic right shift a doubleword by a fixed amount: 32 to 63 bits.
Description: rd ← rt >> (sa + 32) (arithmetic)


The doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into the 
emptied bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63 is 
specified by sa + 32. 


Restrictions: 
None 


Operation: 


s ← 1 || sa    /* 32 + sa */
GPR[rd]63..0 ← (GPR[rt]63)^s || GPR[rt]63..s


Exceptions: 


None 


Programming Notes: 


None
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DSRAV Doubleword Shift Right Arithmetic Variable DSRAV


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     rs       rt       rd       0      DSRAV
000000                               00000   010111
   6        5        5        5        5        6
MIPS III
Format: DSRAV rd, rt, rs
Purpose: To arithmetic right shift a doubleword by a variable number of bits.
Description: rd ← rt >> rs (arithmetic)


The doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into the 
emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 63 is 
specified by the low-order six bits in GPR rs. 


Restrictions: 
None 


Operation: 


s ← GPR[rs]5..0
GPR[rd]63..0 ← (GPR[rt]63)^s || GPR[rt]63..s


Exceptions: 


None 


Programming Notes: 


None 


TOSHIBA Appendix A CPU Instruction Set Details


DSRL Doubleword Shift Right Logical DSRL


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     0       rt       rd       sa      DSRL
000000    00000                              111010
   6        5        5        5        5        6
MIPS III
Format: DSRL rd, rt, sa
Purpose: To logical right shift a doubleword by a fixed amount: 0 to 31 bits.
Description: rd ← rt >> sa (logical)


The doubleword contents of GPR rt are shifted right, inserting zeros into the emptied bits; 
the result is placed in GPR rd. The bit shift count in the range 0 to 31 is specified by sa. 


Restrictions: 
None 


Operation: 


s ← 0 || sa
GPR[rd]63..0 ← 0^s || GPR[rt]63..s


Exceptions: 


None 


Programming Notes: 


None
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DSRL32 Doubleword Shift Right Logical Plus 32 DSRL32


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     0       rt       rd       sa     DSRL32
000000    00000                              111110
   6        5        5        5        5        6
MIPS III
Format: DSRL32 rd, rt, sa
Purpose: To logical right shift a doubleword by a fixed amount: 32 to 63 bits.
Description: rd ← rt >> (sa + 32) (logical)


The 64-bit doubleword contents of GPR rt are shifted right, inserting zeros into the 
emptied bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63 is 
specified by sa + 32. 


Restrictions: 
None 


Operation: 


s ← 1 || sa    /* 32 + sa */
GPR[rd]63..0 ← 0^s || GPR[rt]63..s


Exceptions: 


None 


Programming Notes: 


None 




DSRLV Doubleword Shift Right Logical Variable DSRLV


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     rs       rt       rd       0      DSRLV
000000                               00000   010110
   6        5        5        5        5        6
MIPS III
Format: DSRLV rd, rt, rs
Purpose: To logical right shift a doubleword by a variable number of bits.
Description: rd ← rt >> rs (logical)


The 64-bit doubleword contents of GPR rt are shifted right, inserting zeros into the 
emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 63 is 
specified by the low-order six bits in GPR rs. 


Restrictions: 
None 


Operation: 


s ← GPR[rs]5..0
GPR[rd]63..0 ← 0^s || GPR[rt]63..s
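
The use of only the low-order six bits of GPR rs can be sketched in C as follows. This is an illustrative model; the function name dsrlv_model is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift by the count held in the low six bits of rs. */
static uint64_t dsrlv_model(uint64_t rt, uint64_t rs)
{
    unsigned s = (unsigned)(rs & 0x3Fu);  /* GPR[rs]5..0; upper bits are ignored */

    assert(s < 64);                       /* keeps the C shift well defined */
    return rt >> s;                       /* zeros enter at the top */
}
```

Because the count is masked, a value of 64 in GPR rs shifts by zero bits rather than clearing the register.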


Exceptions: 


None 


Programming Notes: 


None
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DSUB Doubleword Subtract DSUB


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     rs       rt       rd       0      DSUB
000000                               00000   101110
   6        5        5        5        5        6
MIPS III
Format: DSUB rd, rs, rt
Purpose: To subtract 64-bit integers; trap if overflow.
Description: rd ← rs - rt


The 64-bit doubleword value in GPR rt is subtracted from the 64-bit value in GPR rs to
produce a 64-bit result. If the subtraction results in 64-bit 2’s complement arithmetic
overflow then the destination register is not modified and an Integer Overflow exception
occurs. If it does not overflow, the 64-bit result is placed into GPR rd.


Restrictions: 


None 


Operation: 


temp ← GPR[rs]63..0 - GPR[rt]63..0
if (64_bit_arithmetic_overflow) then
    SignalException (IntegerOverflow)
else
    GPR[rd]63..0 ← temp
endif
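
One common way to evaluate the overflow condition is sketched below in C. The manual does not specify how overflow is detected; the sign-comparison test and the function name dsub_model are assumptions made for illustration. The return value stands in for the Integer Overflow exception.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 (destination untouched) on 64-bit signed overflow, else 0. */
static int dsub_model(uint64_t rs, uint64_t rt, uint64_t *rd)
{
    uint64_t temp = rs - rt;   /* subtraction modulo 2^64 */

    assert(rd != NULL);
    /* Overflow occurs when the operands have different signs and the
       sign of the result differs from the sign of rs. */
    if ((((rs ^ rt) & (rs ^ temp)) >> 63) != 0) {
        return 1;              /* models SignalException (IntegerOverflow) */
    }
    *rd = temp;
    return 0;
}
```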


Exceptions: 


Integer Overflow 


Programming Notes: 


DSUBU performs the same arithmetic operation but does not trap on overflow.
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DSUBU Doubleword Subtract Unsigned DSUBU


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     rs       rt       rd       0      DSUBU
000000                               00000   101111
   6        5        5        5        5        6
MIPS III
Format: DSUBU rd, rs, rt
Purpose: To subtract 64-bit integers.
Description: rd ← rs - rt


The 64-bit doubleword value in GPR rt is subtracted from the 64-bit value in GPR rs and
the 64-bit arithmetic result is placed into GPR rd.


No Integer Overflow exception occurs under any circumstances.
Restrictions:
None
Operation:
GPR[rd]63..0 ← GPR[rs]63..0 - GPR[rt]63..0
Exceptions: 
None 
Programming Notes: 
The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is
not signed, such as address arithmetic, or integer arithmetic environments that ignore
overflow, such as C language arithmetic.
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J Jump J 


31 26 25 0 
J
000010 instr_index 
6 26 
MIPS I 
Format: J target 
Purpose: To branch within the current 256 MB aligned region. 
Description: 


This is a PC-region branch (not PC-relative); the effective target address is in the
“current” 256 MB aligned region. The low 28 bits of the target address are the instr_index
field shifted left 2 bits. The remaining upper bits are the corresponding bits of the address
of the instruction in the delay slot (not the jump itself).


Jump to the effective target address. Execute the instruction following the jump, in the
branch delay slot, before jumping.


Restrictions: 


None 


Operation: 

I:
I+1: PC ← PC31..28 || instr_index || 0^2
Exceptions: 


None 


Programming Notes: 


Forming the branch target address by concatenating PC and index bits rather than adding 
a signed offset to the PC is an advantage if all program code addresses fit into a 256 MB 
region aligned on a 256 MB boundary. It allows a branch to anywhere in the region from 
anywhere in the region which a signed relative offset would not allow. 


This definition creates the boundary case where the branch instruction is in the last word 
of a 256 MB region and can therefore only branch to the following 256 MB region 
containing the branch delay slot. 
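
The target address formation, including the boundary case just described, can be sketched in C as follows. The function name jump_target is hypothetical, and the program counter is modeled as a 32-bit value to match the PC31..28 notation in the Operation section.

```c
#include <assert.h>
#include <stdint.h>

/* Effective target of a J instruction located at address jump_pc. */
static uint32_t jump_target(uint32_t jump_pc, uint32_t instr_index)
{
    uint32_t delay_slot_pc;

    assert(instr_index < (UINT32_C(1) << 26));   /* 26-bit field */
    delay_slot_pc = jump_pc + 4;                 /* upper bits come from the delay slot */
    return (delay_slot_pc & 0xF0000000u) | (instr_index << 2);
}
```

With a jump in the last word of a region, for example at 0x8FFFFFFC, the delay slot lies at 0x90000000, so the upper four bits select the following 256 MB region.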
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JAL Jump and Link JAL 


31 26 25 0 
JAL
000011 instr_index
6 26
MIPS I

Format: JAL target
Purpose: To execute a procedure call within the current 256 MB aligned region.
Description: 


Place the return address link in GPR 31. The return link is the address of the second 
instruction following the branch, where execution would continue after a procedure call. 


This is a PC-region branch (not PC-relative); the effective target address is in the
“current” 256 MB aligned region. The low 28 bits of the target address are the instr_index
field shifted left 2 bits. The remaining upper bits are the corresponding bits of the address
of the instruction in the delay slot (not the jump itself).


Jump to the effective target address. Execute the instruction following the jump, in the
branch delay slot, before jumping.


Restrictions: 


None 


Operation: 


I:   GPR[31]63..0 ← zero_extend (PC + 8)
I+1: PC ← PC31..28 || instr_index || 0^2


Exceptions: 
None 

Programming Notes: 
Forming the branch target address by concatenating PC and index bits rather than adding
a signed offset to the PC is an advantage if all program code addresses fit into a 256 MB
region aligned on a 256 MB boundary. It allows a branch to anywhere in the region from
anywhere in the region which a signed relative offset would not allow.


This definition creates the boundary case where the branch instruction is in the last word 
of a 256 MB region and can therefore only branch to the following 256 MB region 
containing the branch delay slot.
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JALR Jump and Link Register JALR


31    26 25    21 20    16 15    11 10     6 5      0
SPECIAL     rs       0       rd        0      JALR
000000             00000             00000   001001
   6        5        5        5        5        6
MIPS I
Format: JALR rs (rd = 31 implied)
        JALR rd, rs
Purpose: To execute a procedure call to an instruction address in a register.
Description: rd ← return_addr, PC ← rs


Place the return address link in GPR rd. The return link is the address of the second 
instruction following the branch, where execution would continue after a procedure call. 


Jump to the effective target address in GPR rs. Execute the instruction following the jump,
in the branch delay slot, before jumping.


Restrictions: 


Register specifiers rs and rd must not be equal, because such an instruction does not have 
the same effect when re-executed. The result of executing such an instruction is undefined. 
This restriction permits an exception handler to resume execution by re-executing the 
branch when an exception occurs in the branch delay slot. 


The effective target address in GPR rs must be naturally aligned. If either of the two
least-significant bits is not zero, then an Address Error exception occurs, not for the
jump instruction, but when the branch target is subsequently fetched as an instruction.
Operation: 


I:   temp ← GPR[rs]31..0
     GPR[rd]63..0 ← zero_extend (PC + 8)
I+1: PC ← temp


Exceptions: 
None 
Programming Notes: 


This is the only branch-and-link instruction that can select a register for the return link;
all other link instructions use GPR 31. The default register for GPR rd, if omitted in the
assembly language instruction, is GPR 31.
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JR Jump Register JR 


31    26 25    21 20                       6 5      0
SPECIAL     rs               0                  JR
   6        5               15                  6
MIPS I
Format: JR rs
Purpose: To branch to an instruction address in a register.
Description: PC ← rs


Jump to the effective target address in GPR rs. Execute the instruction following the jump,
in the branch delay slot, before jumping. 


Restrictions: 


The effective target address in GPR rs must be naturally aligned. If either of the two
least-significant bits is not zero, then an Address Error exception occurs, not for the
jump instruction, but when the branch target is subsequently fetched as an instruction.


Operation: 


I:   temp ← GPR[rs]31..0
I+1: PC ← temp
Exceptions: 


None 


Programming Notes: 


None
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LB Load Byte LB


31    26 25    21 20    16 15                       0
   LB      base      rt             offset
   6        5        5                16
MIPS I

Format: LB rt, offset (base)
Purpose: To load a byte from memory as a signed value.
Description: rt ← memory [base + offset]


The contents of the 8-bit byte at the memory location specified by the effective address are 
fetched, sign-extended, and placed in GPR rt. The 16-bit signed offset is added to the 
contents of GPR base to form the effective address. 


Restrictions: 


None 


Operation: (128-bit bus) 


vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
memquad ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr3..0 xor BigEndian^4
GPR[rt]63..0 ← sign_extend (memquad(7+8*byte)..8*byte)

Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 


Programming Notes: 


None 
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LBU Load Byte Unsigned LBU 


31    26 25    21 20    16 15                       0
  LBU      base      rt             offset
   6        5        5                16
MIPS I

Format: LBU rt, offset (base)

Purpose: To load a byte from memory as an unsigned value.

Description: rt ← memory [base + offset]


The contents of the 8-bit byte at the memory location specified by the effective address are 
fetched, zero-extended, and placed in GPR rt. The 16-bit signed offset is added to the 
contents of GPR base to form the effective address. 


Restrictions: 


None 


Operation: (128-bit bus) 


vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
memquad ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr3..0 xor BigEndian^4
GPR[rt]63..0 ← zero_extend (memquad(7+8*byte)..8*byte)

Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 


Programming Notes: 


None 
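
The difference between LB and LBU lies only in how the loaded byte is widened to 64 bits. A minimal C sketch follows; memory is modeled as a plain byte array, address translation and exceptions are omitted, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* LB: the byte is sign-extended into the 64-bit register. */
static uint64_t lb_model(const uint8_t *mem, uint32_t addr)
{
    uint8_t b;

    assert(mem != NULL);
    b = mem[addr];
    return (b & 0x80u) ? (UINT64_C(0xFFFFFFFFFFFFFF00) | b) : b;
}

/* LBU: the byte is zero-extended into the 64-bit register. */
static uint64_t lbu_model(const uint8_t *mem, uint32_t addr)
{
    assert(mem != NULL);
    return mem[addr];
}
```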
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LD Load Doubleword LD 


31    26 25    21 20    16 15                       0
   LD      base      rt             offset
   6        5        5                16
MIPS III
Format: LD rt, offset (base)
Purpose: To load a doubleword from memory.
Description: rt ← memory [base + offset]


The contents of the 64-bit doubleword at the memory location specified by the aligned 
effective address are fetched and placed in GPR rt. The 16-bit signed offset is added to the 
contents of GPR base to form the effective address. 


Restrictions: 


The effective address must be naturally aligned. If any of the three least-significant bits of 
the effective address are non-zero, an Address Error exception occurs. 
Operation: (128-bit bus) 
vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr2..0) ≠ 0^3 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian || 0^3))
byte ← vAddr3..0 xor (BigEndian || 0^3)
memquad ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
GPR[rt]63..0 ← memquad(63+8*byte)..8*byte
Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 


Programming Notes: 


None 
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LDL Load Doubleword Left LDL 


31    26 25    21 20    16 15                       0
  LDL      base      rt             offset
   6        5        5                16
MIPS III
Format: LDL rt, offset (base)
Purpose: To load the more-significant part of a doubleword from an unaligned memory address.
Description: rt ← rt MERGE memory [base + offset]


Paired LDL and LDR instructions are used to load a register with a doubleword from 
eight consecutive bytes in memory starting at an arbitrary byte address. LDL loads the 
left (most-significant) bytes and LDR loads the right (least-significant) bytes. 


The instruction adds the 16-bit signed offset to the contents of GPR base to form the 
effective address. This is the address of the most-significant byte of a doubleword 
composed of eight consecutive bytes in memory. LDL loads from one to eight bytes, the 
most-significant bytes of the doubleword, into the corresponding bytes of GPR rt. It loads 
the bytes that are in the target doubleword that are also in the aligned doubleword which 
contains the byte specified by the effective address. 


Conceptually, it starts at the specified byte in memory and loads that byte into the high- 
order (left-most) byte of the register; then it loads bytes from memory into the register 
until it reaches the low-order byte of the doubleword in memory. The least-significant 
(right-most) byte(s) of the register will not be changed.


[Figure: LDL examples showing memory and the contents of register $24 before and after
the instruction. Little-endian memory: LDL $24,11 ($0). Big-endian memory: LDL $24,3 ($0).]


The contents of GPR rt are internally bypassed within the processor so that no NOP is 
needed between an immediately preceding load instruction which specifies register rt and 
a following LDL (or LDR) instruction which also specifies register rt. 
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No address exceptions due to alignment are possible. 


Restrictions: 


None 


Operation: (128-bit bus) 
vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..3 || 0^3
endif
byte ← 0 || (vAddr2..0 xor BigEndian^3)
doubleword ← vAddr3 xor BigEndian
memquad ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
GPR[rt]63..0 ← memquad(7+8*byte+64*doubleword)..(64*doubleword) || GPR[rt](55-8*byte)..0


Given a doubleword in a register and a doubleword in memory, the operation of LDL is as 
follows: 
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LDL 
[Table: LDL with little-endian byte ordering (BigEndianCPU = 0). The register is shown
from MSB (bit 63) to LSB (bit 0) and memory with little-endian byte addresses 15 to 0.
For each value of vAddr3..0 the table gives the destination register contents after the
instruction (shaded is unchanged) and the Type.]
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LDL 


[Table: LDL with big-endian byte ordering (BigEndianCPU = 1). The register is shown
from MSB (bit 63) to LSB (bit 0) and memory with big-endian byte addresses 0 to 15 and
little-endian byte addresses 15 to 0. For each value of vAddr3..0 the table gives the
destination register contents after the instruction (shaded is unchanged), the Type,
and the Offset for LEM and BEM.]


LEM     Little-endian memory (BigEndian = 0)
BEM     BigEndian = 1
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory
Exceptions: 
TLB Refill 
TLB Invalid 


Address Error 


Programming Notes: 


None 
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LDR Load Doubleword Right LDR 


31    26 25    21 20    16 15                       0
  LDR      base      rt             offset
   6        5        5                16
MIPS III
Format: LDR rt, offset (base)
Purpose: To load the less-significant part of a doubleword from an unaligned memory address.
Description: rt ← rt MERGE memory [base + offset]


Paired LDL and LDR instructions are used to load a register with a doubleword from 
eight consecutive bytes in memory starting at an arbitrary byte address. LDL loads the 
left (most-significant) bytes and LDR loads the right (least-significant) bytes. 


The instruction adds the 16-bit signed offset to the contents of GPR base to form the 
effective address. This is the address of the least-significant byte of a doubleword
composed of eight consecutive bytes in memory. LDR loads from one to eight bytes, the 
least-significant bytes of the doubleword, into the corresponding bytes of GPR rt. It loads 
the bytes that are in the target doubleword that are also in the aligned doubleword which 
contains the byte specified by the effective address. 


Conceptually, it starts at the specified byte in memory and loads that byte into the low- 
order (right-most) byte of the register; then it loads bytes from memory into the register 
until it reaches the high-order byte of the doubleword in memory. The most significant 
(left-most) byte(s) of the register will not be changed.


[Figure: LDR examples showing memory and the contents of register $24 before and after
the instruction. Little-endian memory: LDR $24,4 ($0). Big-endian memory: LDR $24,4 ($0).]


The contents of GPR rt are internally bypassed within the processor so that no NOP is 
needed between an immediately preceding load instruction which specifies register rt and 
a following LDR (or LDL) instruction which also specifies register rt. 
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No address exceptions due to alignment are possible. 


Restrictions: 


None 


Operation: (128-bit bus) 
vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 1) then
    pAddr ← pAddr(PSIZE-1)..3 || 0^3
endif
byte ← 0 || (vAddr2..0 xor BigEndian^3)
doubleword ← vAddr3 xor BigEndian
memquad ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
GPR[rt]63..0 ← GPR[rt]63..(64-8*byte) || memquad(63+64*doubleword)..(64*doubleword+8*byte)


Given a doubleword in a register and a doubleword in memory, the operation of LDR is as 
follows: 
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LDR 
[Table: LDR with little-endian byte ordering (BigEndianCPU = 0). The register is shown
from MSB (bit 63) to LSB (bit 0) and memory with little-endian byte addresses 15 to 0.
For each value of vAddr3..0 the table gives the destination register contents after the
instruction.]
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LDR 


[Table: LDR with big-endian byte ordering (BigEndianCPU = 1). The register is shown
from MSB (bit 63) to LSB (bit 0) and memory with big-endian byte addresses 0 to 15 and
little-endian byte addresses 15 to 0. For each value of vAddr3..0 the table gives the
destination register contents after the instruction, the Type, and the Offset for LEM
and BEM.]


LEM     Little-endian memory (BigEndianMem = 0)
BEM     BigEndianMem = 1
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory
Exceptions: 
TLB Refill 
TLB Invalid 


Address Error 


Programming Notes: 


None 
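
The way a paired LDR and LDL assemble an unaligned doubleword can be sketched in C for the little-endian case. This is an illustrative model under stated assumptions: memory is a plain byte array, address translation and exceptions are omitted, and the function names ldl_le and ldr_le are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* LDL, little-endian: the byte at addr goes to the most-significant register
   byte; lower addresses follow down to the aligned doubleword boundary. */
static uint64_t ldl_le(const uint8_t *mem, uint32_t addr, uint64_t rt)
{
    unsigned n = (addr & 7u) + 1u;            /* 1 to 8 bytes are loaded */
    unsigned i;

    assert(mem != NULL);
    for (i = 0; i < n; i++) {
        unsigned shift = 56u - 8u * i;        /* register byte 7 - i */
        rt &= ~(UINT64_C(0xFF) << shift);
        rt |= (uint64_t)mem[addr - i] << shift;
    }
    return rt;
}

/* LDR, little-endian: the byte at addr goes to the least-significant register
   byte; higher addresses follow up to the end of the aligned doubleword. */
static uint64_t ldr_le(const uint8_t *mem, uint32_t addr, uint64_t rt)
{
    unsigned n = 8u - (addr & 7u);            /* 1 to 8 bytes are loaded */
    unsigned i;

    assert(mem != NULL);
    for (i = 0; i < n; i++) {
        unsigned shift = 8u * i;              /* register byte i */
        rt &= ~(UINT64_C(0xFF) << shift);
        rt |= (uint64_t)mem[addr + i] << shift;
    }
    return rt;
}
```

In this model, a doubleword starting at little-endian byte address p is obtained by applying ldr_le at p and then ldl_le at p + 7, in either order.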
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LH Load Halfword LH 


31    26 25    21 20    16 15                       0
   LH      base      rt             offset
   6        5        5                16
MIPS I

Format: LH rt, offset (base)
Purpose: To load a halfword from memory as a signed value.
Description: rt ← memory [base + offset]


The contents of the 16-bit halfword at the memory location specified by the aligned 
effective address are fetched, sign-extended, and placed in GPR rt. The 16-bit signed offset 
is added to the contents of GPR base to form the effective address. 


Restrictions: 


The effective address must be naturally aligned. If the least-significant bit of the address 
is non-zero, an Address Error exception occurs. 


Operation: (128-bit bus) 
vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr0) ≠ 0 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian^3 || 0))
memquad ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
byte ← vAddr3..0 xor (BigEndian^3 || 0)
GPR[rt]63..0 ← sign_extend (memquad(15+8*byte)..8*byte)
Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 


Programming Notes: 


None 
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LHU Load Halfword Unsigned LHU 


31    26 25    21 20    16 15                       0
  LHU      base      rt             offset
   6        5        5                16
MIPS I
Format: LHU rt, offset (base)
Purpose: To load a halfword from memory as an unsigned value.
Description: rt ← memory [base + offset]


The contents of the 16-bit halfword at the memory location specified by the aligned 
effective address are fetched, zero-extended, and placed in GPR rt. The 16-bit signed offset 
is added to the contents of GPR base to form the effective address. 


Restrictions: 


The effective address must be naturally aligned. If the least-significant bit of the address 
is non-zero, an Address Error exception occurs. 


Operation: (128-bit bus) 
vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr0) ≠ 0 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian^3 || 0))
memquad ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
byte ← vAddr3..0 xor (BigEndian^3 || 0)
GPR[rt]63..0 ← zero_extend (memquad(15+8*byte)..8*byte)
Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 


Programming Notes: 


None 
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LUI Load Upper Immediate LUI 


31    26 25    21 20    16 15                       0
  LUI       0        rt           immediate
001111    00000
   6        5        5                16
MIPS I
Format: LUI rt, immediate
Purpose: To load a constant into the upper half of a word.
Description: rt ← immediate || 0^16


The 16-bit immediate is shifted left 16 bits and concatenated with 16 bits of low-order 
zeros. The 32-bit result is sign-extended and placed into GPR rt. 


Restrictions: 
None 


Operation: 
GPR[rt]63..0 ← sign_extend (immediate || 0^16)
Exceptions: 


None 


Programming Notes: 


None 
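
The LUI operation can be sketched in C as follows. The function name lui_model is hypothetical; the immediate is passed as an unsigned value and checked to fit in 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* LUI: form immediate || 0^16 as a 32-bit word, then sign-extend to 64 bits. */
static uint64_t lui_model(uint32_t immediate)
{
    uint32_t word;

    assert(immediate <= 0xFFFFu);   /* 16-bit field */
    word = immediate << 16;         /* low-order 16 bits are zero */
    return (word & 0x80000000u) ? (UINT64_C(0xFFFFFFFF00000000) | word) : word;
}
```

An immediate with bit 15 set therefore produces a register value whose bits 63..32 are all ones.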
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LW Load Word LW 


31    26 25    21 20    16 15                       0
   LW      base      rt             offset
   6        5        5                16
MIPS I
Format: LW rt, offset (base)
Purpose: To load a word from memory as a signed value.
Description: rt ← memory [base + offset]


The contents of the 32-bit word at the memory location specified by the aligned effective 
address are fetched, sign-extended to the GPR register length if necessary, and placed in 
GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective 
address. 


Restrictions: 


The effective address must be naturally aligned. If either of the two least-significant bits
of the address is non-zero, an Address Error exception occurs.


Operation: (128-bit bus) 
vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr1..0) ≠ 0^2 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian^2 || 0^2))
memquad ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
byte ← vAddr3..0 xor (BigEndian^2 || 0^2)
GPR[rt]63..0 ← sign_extend (memquad(31+8*byte)..8*byte)
Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 


Programming Notes: 


None 
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LWL Load Word Left LWL 


31    26 25    21 20    16 15                       0
  LWL      base      rt             offset
   6        5        5                16
MIPS I

Format: LWL rt, offset (base)

Purpose: To load the more-significant part of a word from an unaligned memory address as a
signed value.
Description: rt ← rt MERGE memory [base + offset]


Paired LWL and LWR instructions are used to load a register with a word from four 
consecutive bytes in memory starting at an arbitrary byte address. LWL loads the left 
(most-significant) bytes and LWR loads the right (least-significant) bytes. 


The instruction adds the 16-bit signed offset to the contents of GPR base to form the effective 
address. This is the address of the most-significant byte of a word composed of four consecutive 
bytes in memory. LWL loads from one to four bytes, the most-significant bytes of the word, 
into the corresponding bytes of GPR rt. It loads the bytes that are in the target word that are 
also in the aligned word which contains the byte specified by the effective address. 


Bit 31 of the register is loaded so the loaded word is sign-extended. 


Conceptually, it starts at the specified byte in memory and loads that byte into the high- 
order (left-most) byte of the register; then it loads bytes from memory into the register 
until it reaches the low-order byte of the word in memory. The least-significant (right- 
most) byte(s) of the register will not be changed. 


[Figure: LWL example with little-endian memory at addresses 0 and 4, showing the contents
of register $24 before and after LWL $24,4 ($0).]
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The contents of GPR rt are internally bypassed within the processor so that no NOP is 
needed between an immediately preceding load instruction which specifies register rt and 
a following LWL (or LWR) instruction which also specifies register rt. 


No address exceptions due to alignment are possible. 


Restrictions: 


None 


Operation: (128-bit bus) 
vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..2 || 0^2
endif
byte ← 0^2 || (vAddr1..0 xor BigEndian^2)
word ← vAddr3..2 xor BigEndian^2
memquad ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
temp ← memquad(32*word+8*byte+7)..32*word || GPR[rt](23-8*byte)..0
GPR[rt]63..0 ← (temp31)^32 || temp


Given a doubleword in a register and a doubleword in memory, the operation of LWL is as 
follows: 
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LWL 
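               MSB 63                      0 LSB
Register        a   b   c   d   e   f   g   h

Little-endian  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
Memory          I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X

Little-endian byte ordering (BigEndianCPU = 0)

Destination register contents after instruction (lowercase letters are unchanged
register bytes):

vAddr3..0  bits 63..32             bits 31..0    Type  Offset  Offset
                                                        LEM     BEM
    0      Sign bit(31) extended   X  f  g  h     0      0      15
    1      Sign bit(31) extended   W  X  g  h     1      0      14
    2      Sign bit(31) extended   V  W  X  h     2      0      13
    3      Sign bit(31) extended   U  V  W  X     3      0      12
    4      Sign bit(31) extended   T  f  g  h     0      4      11
    5      Sign bit(31) extended   S  T  g  h     1      4      10
    6      Sign bit(31) extended   R  S  T  h     2      4       9
    7      Sign bit(31) extended   Q  R  S  T     3      4       8
    8      Sign bit(31) extended   P  f  g  h     0      8       7
    9      Sign bit(31) extended   O  P  g  h     1      8       6
   10      Sign bit(31) extended   N  O  P  h     2      8       5
   11      Sign bit(31) extended   M  N  O  P     3      8       4
   12      Sign bit(31) extended   L  f  g  h     0     12       3
   13      Sign bit(31) extended   K  L  g  h     1     12       2
   14      Sign bit(31) extended   J  K  L  h     2     12       1
   15      Sign bit(31) extended   I  J  K  L     3     12       0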
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LWL 
               MSB 63                      0 LSB
Register        a   b   c   d   e   f   g   h

Big-endian      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Memory          I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X
Little-endian  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0

Big-endian byte ordering (BigEndianCPU = 1)

Destination register contents after instruction (lowercase letters are unchanged
register bytes):

vAddr3..0  bits 63..32             bits 31..0    Type  Offset  Offset
                                                        LEM     BEM
    0      Sign bit(31) extended   I  J  K  L     3     12       0
    1      Sign bit(31) extended   J  K  L  h     2     12       1
    2      Sign bit(31) extended   K  L  g  h     1     12       2
    3      Sign bit(31) extended   L  f  g  h     0     12       3
    4      Sign bit(31) extended   M  N  O  P     3      8       4
    5      Sign bit(31) extended   N  O  P  h     2      8       5
    6      Sign bit(31) extended   O  P  g  h     1      8       6
    7      Sign bit(31) extended   P  f  g  h     0      8       7
    8      Sign bit(31) extended   Q  R  S  T     3      4       8
    9      Sign bit(31) extended   R  S  T  h     2      4       9
   10      Sign bit(31) extended   S  T  g  h     1      4      10
   11      Sign bit(31) extended   T  f  g  h     0      4      11
   12      Sign bit(31) extended   U  V  W  X     3      0      12
   13      Sign bit(31) extended   V  W  X  h     2      0      13
   14      Sign bit(31) extended   W  X  g  h     1      0      14
   15      Sign bit(31) extended   X  f  g  h     0      0      15


LEM     Little-endian memory (BigEndianMem = 0)
BEM     BigEndianMem = 1
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory
Exceptions: 
TLB Refill 
TLB Invalid 


Address Error 


Programming Notes: 


The architecture provides no direct support for treating unaligned words as unsigned 
values, i.e. zeroing bits 63..32 of the destination register when bit 31 is loaded. See SLL or 
SLLV for a single-instruction method of propagating the word sign bit in a register into 
the upper half of a 64-bit register. 
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LWR Load Word Right LWR 


31 26 25 21 20 16 15 0
LWR base rt offset
6 5 5 16

MIPS I

Format: LWR rt, offset (base)

Purpose: To load the less-significant part of a word from an unaligned memory address as a signed
value.
Description: rt ← rt MERGE memory [base + offset]


Paired LWL and LWR instructions are used to load a register with a word from four 
consecutive bytes in memory starting at an arbitrary byte address. LWL loads the left 
(most-significant) bytes and LWR loads the right (least-significant) bytes. 


The instruction adds the 16-bit signed offset to the contents of GPR base to form the effective 
address. This is the address of the least-significant byte of a word composed of four consecutive 
bytes in memory. LWR loads from one to four bytes, the least-significant bytes of the word, 
into the corresponding bytes of GPR rt. It loads the bytes that are in the target word that are 
also in the aligned word which contains the byte specified by the effective address. 


If the word sign bit (bit 31) is loaded from memory into the register by the instruction, 
then the loaded word is sign-extended. If the sign bit is not loaded from memory by the 
LWR, then bits 63..32 of the destination are unchanged. 


Conceptually, it starts at the specified byte in memory and loads that byte into the low- 
order (right-most) byte of the register; then it loads bytes from memory into the register 
until it reaches the high-order byte of the word in memory. The most significant (left- 
most) byte(s) of the register will not be changed. 


[Figure: LWR examples for little-endian memory (LWR $24,1 ($0)) and for big-endian memory,
each showing memory and the register before and after the instruction.]

The contents of GPR rt are internally bypassed within the processor so that no NOP is 
needed between an immediately preceding load instruction which specifies register rt and 
a following LWR (or LWL) instruction which also specifies register rt. 


No address exceptions due to alignment are possible. 
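
The pairing of LWR and LWL can be illustrated with a small C model. This is only a sketch under stated assumptions: it models little-endian byte ordering, only the low 32 bits of the register (sign extension into bits 63..32 is omitted), and memory as a plain byte array. The function names `aligned_word_le`, `lwr_le`, `lwl_le` and `load_unaligned_word_le` are hypothetical and do not come from this manual.

```c
#include <stdint.h>

/* Read the aligned 32-bit word that contains byte address addr from a
   little-endian byte array standing in for memory. */
static uint32_t aligned_word_le(const uint8_t *mem, uint32_t addr)
{
    const uint8_t *p = mem + (addr & ~(uint32_t)3);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* LWR, little-endian: addr names the least-significant byte of the word.
   Bytes from addr up to the end of the aligned word go into the low-order
   bytes of the register; the remaining high-order bytes are kept. */
static uint32_t lwr_le(uint32_t reg, const uint8_t *mem, uint32_t addr)
{
    unsigned shift = 8 * (addr & 3);
    uint32_t keep = ~(0xFFFFFFFFu >> shift);
    return (reg & keep) | (aligned_word_le(mem, addr) >> shift);
}

/* LWL, little-endian: addr names the most-significant byte of the word.
   Bytes from the start of the aligned word up to addr go into the
   high-order bytes of the register; the remaining low-order bytes are kept. */
static uint32_t lwl_le(uint32_t reg, const uint8_t *mem, uint32_t addr)
{
    unsigned byte = addr & 3;
    uint32_t keep = 0x00FFFFFFu >> (8 * byte);
    return (reg & keep) | (aligned_word_le(mem, addr) << (24 - 8 * byte));
}

/* Unaligned word load: LWR rt, 0(addr) followed by LWL rt, 3(addr). */
static uint32_t load_unaligned_word_le(const uint8_t *mem, uint32_t addr)
{
    uint32_t r = 0;
    r = lwr_le(r, mem, addr);
    r = lwl_le(r, mem, addr + 3);
    return r;
}
```

In this model the two instructions may be issued in either order, because each one writes only the register bytes that the other leaves unchanged.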


Restrictions: 


None 


Operation: (128-bit bus)
vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 1) then
    pAddr ← pAddr(PSIZE-1)..3 || 0^3
endif
byte ← 0 || (vAddr1..0 xor BigEndian^2)
word ← vAddr3..2 xor BigEndian^2
memquad ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
temp ← GPR[rt]31..(32-8*byte) || memquad(31+32*word)..(32*word+8*byte)
if (byte = 4) then
    utemp ← (temp31)^32          /* loaded bit 31, must sign extend */
else
    one of the following two behaviors:
    utemp ← GPR[rt]63..32        /* leave what was there alone */
    utemp ← (GPR[rt]31)^32       /* sign-extend bit 31 */
endif
GPR[rt]63..0 ← utemp || temp


Given a word in a register and a word in memory, the operation of LWR is as follows: 
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TOSHIBA 


LWR

MSB 63                          0 LSB
Register        a  b  c  d  e  f  g  h

Little-endian   15 14 13 12 11 10 9  8  7  6  5  4  3  2  1  0
Memory          I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X

Little-endian byte ordering (BigEndianCPU = 0)

vAddr3..0   Destination register contents after instruction   Type   Offset (LEM, BEM)


0 Sign bit (31) extended e f g I 0 15 0 
1 Sign bit (31) extended or unchanged e f I J 1 14 0 
2 Sign bit (31) extended or unchanged @ I J K 2 13 0 
3 Sign bit (31) extended or unchanged I J K L 3 12 0 
4 Sign bit (31) extended e f g M 0 11 4 
5 Sign bit (31) extended or unchanged e f M N 1 10 4 
6 Sign bit (31) extended or unchanged e M N oO 2 9 4 
7 Sign bit (31) extended or unchanged N oO P 3 8 4 
8 Sign bit (31) extended g Q 0 7 8 
9 Sign bit (31) extended or unchanged Q R 6 8 

Sign bit (31) extended or unchanged R Ss 5 8 

Sign bit (31) extended or unchanged Ss T 4 

Sign bit (31) extended g U 3 

Sign bit (31) extended or unchanged U Vv 2 

Sign bit (31) extended or unchanged Vv Ww 

Sign bit (31) extended or unchanged Ww X 
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LWR 


MSB 63                          0 LSB
Register        a  b  c  d  e  f  g  h

Big-endian      0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
Memory          I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X
Little-endian   15 14 13 12 11 10 9  8  7  6  5  4  3  2  1  0

Big-endian byte ordering (BigEndianCPU = 1)

[Table: destination register contents after instruction, Type, and Offset (LEM, BEM) for each
value of vAddr3..0]



LEM     Little-endian memory (BigEndianMem = 0)
BEM     Big-endian memory (BigEndianMem = 1)
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory
Exceptions: 
TLB Refill 
TLB Invalid 


Address Error 


Programming Notes: 
The architecture provides no direct support for treating unaligned words as unsigned 
values, i.e. zeroing bits 63..32 of the destination register when bit 31 is loaded. See SLL or 


SLLV for a single-instruction method of propagating the word sign bit in a register into 
the upper half of a 64-bit register. 
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LWU Load Word Unsigned LWU 


31 26 25 21 20 16 15 0 
LWU 
6 5 5 16 
MIPS III 
Format: LWU rt, offset (base) 
Purpose: To load a word from memory as an unsigned value. 
Description: rt ← memory [base + offset]


The contents of the 32-bit word at the memory location specified by the aligned effective 
address are fetched, zero-extended, and placed in GPR rt. The 16-bit signed offset is added 
to the contents of GPR base to form the effective address. 
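
The difference between the zero extension performed by LWU and the sign extension performed by a signed word load can be sketched in C. The helper names `lwu_extend` and `lw_extend` are hypothetical; they model only how a fetched 32-bit word is widened to a 64-bit register value, not the memory access itself.

```c
#include <stdint.h>

/* LWU: bits 63..32 of the destination are always zero. */
static uint64_t lwu_extend(uint32_t word)
{
    return (uint64_t)word;
}

/* Signed word load, for contrast: bit 31 is copied into bits 63..32. */
static uint64_t lw_extend(uint32_t word)
{
    uint64_t v = word;
    if (word & 0x80000000u)
        v |= 0xFFFFFFFF00000000ull;
    return v;
}
```

The two results differ only when bit 31 of the loaded word is set.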


Restrictions: 


The effective address must be naturally aligned. If either of the two least-significant bits
of the address is non-zero, an Address Error exception occurs.
Operation: (128-bit bus)
vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr1..0) ≠ 0^2 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian^2 || 0^2))
memquad ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
byte ← vAddr3..0 xor (BigEndian^2 || 0^2)
GPR[rt]63..0 ← 0^32 || memquad(31+8*byte)..8*byte
Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 


Programming Notes: 


None 
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MFHI Move from HI Register MFHI 


31 26 25 16 15 11 10 6 5 0 
SPECIAL 0 rd 0 MFHI
6 10 5 5 6
MIPS I
Format: MFHI rd
Purpose: To copy the special purpose HI register to a GPR.
Description: rd ← HI


The contents of special register HI are loaded into GPR rd.
Restrictions:
None


Operation:
GPR[rd]63..0 ← HI63..0
Exceptions: 


None 


Programming Notes: 


No restriction is needed because the C790 has an interlock mechanism for MULT and DIV
instructions.
MFLO Move from LO Register MFLO
31 26 25 16 15 11 10 6 5 0
SPECIAL 0 rd 0 MFLO
000000 00 0000 0000 00000 010010
6 10 5 5 6
MIPS I
Format: MFLO rd
Purpose: To copy the special purpose LO register to a GPR.
Description: rd ← LO


The contents of special register LO are loaded into GPR rd.
Restrictions:
None


Operation:
GPR[rd]63..0 ← LO63..0
Exceptions: 


None 


Programming Notes: 


(Same as MFHI)
MOVN Move Conditional on Not Zero MOVN


31 26 25 21 20 16 15 11 10 6 5 0
SPECIAL rs rt rd 0 MOVN
000000 00000 001011
6 5 5 5 5 6
MIPS IV
Format: MOVN rd, rs, rt
Purpose: To conditionally move a GPR after testing a GPR value.
Description: if (rt ≠ 0) then rd ← rs


If the value in GPR rt is not equal to zero, then the contents of GPR rs are placed into 
GPR rd. 


Restrictions: 


None 


Operation: 


if GPR[rt]63..0 ≠ 0 then
    GPR[rd]63..0 ← GPR[rs]63..0
endif


Exceptions: 


None 


Programming Notes: 


The nonzero value tested here is the “condition true” result from the SLT, SLTI, SLTU,
and SLTIU comparison instructions.
MOVZ Move Conditional on Zero MOVZ


31 26 25 21 20 16 15 11 10 6 5 0
SPECIAL rs rt rd 0 MOVZ
000000 00000 001010
6 5 5 5 5 6
MIPS IV
Format: MOVZ rd, rs, rt
Purpose: To conditionally move a GPR after testing a GPR value.
Description: if (rt = 0) then rd ← rs


If the value in GPR rt is equal to zero, then the contents of GPR rs are placed into GPR rd. 


Restrictions: 


None 


Operation: 


if GPR[rt]63..0 = 0 then
    GPR[rd]63..0 ← GPR[rs]63..0
endif


Exceptions: 


None 


Programming Notes: 


The zero value tested here is the “condition false” result from the SLT, SLTI, SLTU, and 
SLTIU comparison instructions. 
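
A comparison followed by a conditional move gives a branch-free select. The C sketch below models SLT, MOVN and MOVZ as plain functions and uses them to compute a signed maximum and minimum. The function names and the choice of example are illustrative assumptions, not code from this manual.

```c
#include <stdint.h>

/* SLT rd, rs, rt: 1 if rs < rt (signed), otherwise 0. */
static int64_t slt(int64_t rs, int64_t rt) { return rs < rt ? 1 : 0; }

/* MOVN rd, rs, rt: rd receives rs when rt is non-zero, else rd is unchanged. */
static int64_t movn(int64_t rd, int64_t rs, int64_t rt) { return rt != 0 ? rs : rd; }

/* MOVZ rd, rs, rt: rd receives rs when rt is zero, else rd is unchanged. */
static int64_t movz(int64_t rd, int64_t rs, int64_t rt) { return rt == 0 ? rs : rd; }

/* max(a, b): t = (a < b); rd = a; if (t != 0) rd = b */
static int64_t max_signed(int64_t a, int64_t b)
{
    int64_t t = slt(a, b);
    int64_t rd = a;
    return movn(rd, b, t);
}

/* min(a, b): t = (a < b); rd = a; if (t == 0) rd = b */
static int64_t min_signed(int64_t a, int64_t b)
{
    int64_t t = slt(a, b);
    int64_t rd = a;
    return movz(rd, b, t);
}
```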
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MTHI Move to HI Register MTHI 


31 26 25 21 20 6 5 0
SPECIAL rs 0 MTHI
000000 000 0000 0000 0000 010001
6 5 15 6


MIPS I
Format: MTHI rs
Purpose: To copy a GPR to the special purpose HI register.
Description: HI ← rs
The contents of GPR rs are loaded into special register HI.
Restrictions:
None


Operation:
HI63..0 ← GPR[rs]63..0
Exceptions: 


None 


Programming Notes: 


None 
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MTLO Move to LO Register MTLO 


31 26 25 21 20 6 5 0
SPECIAL rs 0 MTLO
000000 000 0000 0000 0000 010011
6 5 15 6
MIPS I
Format: MTLO rs
Purpose: To copy a GPR to the special purpose LO register.
Description: LO ← rs


The contents of GPR rs are loaded into special register LO.
Restrictions:
None


Operation:
LO63..0 ← GPR[rs]63..0
Exceptions: 


None 


Programming Notes: 


None 


TOSHIBA Appendix A CPU Instruction Set Details SE 


MULT Multiply Word MULT 


31 26 25 21 20 16 15 6 5 0
SPECIAL rs rt 0 MULT
6 5 5 10 6
MIPS I
Format: MULT rs, rt
Purpose: To multiply 32-bit signed integers.
Description: (LO, HI) ← rs × rt


The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as signed values, to produce a 64-bit result. The low-order 32-bit word of the
result is placed into special register LO, and the high-order 32-bit word is placed into
special register HI.


No arithmetic exception occurs under any circumstances.

Restrictions:
If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.


Operation:
if (NotWordValue (GPR[rs]) or NotWordValue (GPR[rt])) then UndefinedResult() endif
prod ← GPR[rs]31..0 * GPR[rt]31..0
LO63..0 ← (prod31)^32 || prod31..0
HI63..0 ← (prod63)^32 || prod63..32
Exceptions: 


None 


Programming Notes: 


In the C790, the integer multiply operation proceeds asynchronously and allows other
CPU instructions to execute before it is retired. An attempt to read LO or HI before the
results are written will wait (interlock) until the results are ready. Asynchronous
execution does not affect the program result, but offers an opportunity for performance
improvement by scheduling the multiply so that other instructions can execute in parallel.


Programs that require overflow detection must check for it explicitly. 
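
An explicit overflow check can be sketched in C as follows. The model splits the 64-bit product into sign-extended LO and HI halves as in the Operation section, then reports overflow when HI is not the sign extension of bit 31 of LO, that is, when the product does not fit in a signed 32-bit word. The type `hilo_t` and the function names are hypothetical.

```c
#include <stdint.h>

typedef struct { int64_t lo; int64_t hi; } hilo_t;

/* Sign-extend a 32-bit word to 64 bits without relying on
   implementation-defined conversions. */
static int64_t sext32(uint32_t w)
{
    return (int64_t)w - ((w & 0x80000000u) ? 0x100000000LL : 0);
}

/* MULT rs, rt: signed 32 x 32 -> 64-bit product, split into LO and HI. */
static hilo_t mult(int32_t rs, int32_t rt)
{
    uint64_t prod = (uint64_t)((int64_t)rs * (int64_t)rt);
    hilo_t r;
    r.lo = sext32((uint32_t)(prod & 0xFFFFFFFFu));
    r.hi = sext32((uint32_t)(prod >> 32));
    return r;
}

/* Returns 1 when the product does not fit in a signed 32-bit word. */
static int mult_overflows(hilo_t r)
{
    int64_t expected_hi = (r.lo < 0) ? -1 : 0;
    return r.hi != expected_hi;
}
```

In assembly, the same test is usually a read of HI and LO followed by an arithmetic shift of LO by 31 and a comparison with HI.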
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MULTU Multiply Unsigned Word MULTU 


31 26 25 21 20 16 15 6 5 0
SPECIAL rs rt 0 MULTU
000000 00 0000 0000 011001
6 5 5 10 6
MIPS I
Format: MULTU rs, rt
Purpose: To multiply 32-bit unsigned integers.
Description: (LO, HI) ← rs × rt


The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both 
operands as unsigned values, to produce a 64-bit result. The low-order 32-bit word of the 
result is placed into special register LO, and the high-order 32-bit word is placed into 
special register HI.
No arithmetic exception occurs under any circumstances.

Restrictions:
If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.


Operation:


if (NotWordValue (GPR[rs]) or NotWordValue (GPR[rt])) then UndefinedResult() endif
prod ← (0 || GPR[rs]31..0) * (0 || GPR[rt]31..0)
LO63..0 ← (prod31)^32 || prod31..0
HI63..0 ← (prod63)^32 || prod63..32
Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the MULT instruction.
NOR Not Or NOR


31 26 25 21 20 16 15 11 10 6 5 0
SPECIAL rs rt rd 0 NOR
000000 00000 100111
6 5 5 5 5 6
MIPS I
Format: NOR rd, rs, rt
Purpose: To do a bitwise logical NOT OR.
Description: rd ← rs NOR rt


The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical NOR
operation. The result is placed into GPR rd.


Restrictions: 
None 


Operation: 
GPR[rd]63..0 ← GPR[rs]63..0 nor GPR[rt]63..0
Exceptions: 


None 


Programming Notes: 


None
OR Or OR


31 26 25 21 20 16 15 11 10 6 5 0
SPECIAL rs rt rd 0 OR
000000 00000 100101
6 5 5 5 5 6
MIPS I
Format: OR rd, rs, rt
Purpose: To do a bitwise logical OR.
Description: rd ← rs OR rt


The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical OR 
operation. The result is placed into GPR rd. 


Restrictions: 
None 


Operation: 
GPR[rd]63..0 ← GPR[rs]63..0 or GPR[rt]63..0
Exceptions: 


None 


Programming Notes: 


None 
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ORI Or Immediate ORI 


31 26 25 21 20 16 15 0 
ORI rs rt immediate
6 5 5 16
MIPS I
Format: ORI rt, rs, immediate
Purpose: To do a bitwise logical OR with a constant.
Description: rt ← rs OR immediate


The 16-bit immediate is zero-extended to the left and combined with the contents of GPR
rs in a bitwise logical OR operation. The result is placed into GPR rt.


Restrictions:
None


Operation:
GPR[rt]63..0 ← zero_extend (immediate) or GPR[rs]63..0
Exceptions: 


None 


Programming Notes: 


None 
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PREF Prefetch PREF 


31 26 25 21 20 16 15 0 
PREF i 
6 5 5 16 
MIPS IV 
Format: PREF hint, offset (base) 
Purpose: To prefetch data from memory. 
Description: prefetch_memory (base+offset) 


PREF adds the 16-bit signed offset to the contents of GPR base to form an effective byte 
address. It advises that data at the effective address may be used in the near future. 


If the hint field is 00000 (binary), this instruction prefetches a block of data from main
memory into cache.


PREF is an advisory instruction. It may change the performance of the program. For all 
hint values and all effective addresses, it neither changes architecturally-visible state nor 
alters the meaning of the program. 


PREF does not cause addressing-related exceptions. If it raises an exception condition, the
exception condition is ignored. If an addressing-related exception condition is raised and
ignored, no data will be prefetched. Even if no data is prefetched in such a case, some
action that is not architecturally-visible, such as writeback of a dirty cache line, might
take place.


PREF will never generate a memory operation for a location with an uncached memory 
access type. 


The defined hint values are shown in the table below. The C790 only supports hint = 0. 
The hint table may be extended in future implementations. 


Values of hint field for prefetch instruction

Value   Name         Data use and desired prefetch action
0       load         Data is expected to be loaded (not modified).
                     Fetch data as if for a load.
1-31    (Reserved)   (Reserved)
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Restrictions: 


None 


Operation: 


vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
Prefetch (uncached, pAddr, vAddr, DATA, hint)


Exceptions: 
None 
Programming Notes: 


Prefetch cannot prefetch data from a mapped location unless the translation for that
location is present in the TLB. Locations in memory pages that have not been accessed
recently may not have translations in the TLB, so prefetch may not be effective for such
locations.


On the C790, prefetch may not prefetch data while a bus read is outstanding because of
a data cache miss, an uncached load, or a miss on the uncached accelerated buffer.


Prefetch does not cause addressing exceptions. Prefetching with an address pointer value
before the validity of the pointer has been determined will not cause an exception.


Implementation Notes: 


A reserved hint field value causes a default prefetch action, the load hint.
SB Store Byte SB


31 26 25 21 20 16 15 0
SB base rt offset
6 5 5 16
MIPS I

Format: SB rt, offset (base)
Purpose: To store a byte to memory.
Description: memory [base + offset] ← rt


The least-significant 8-bit byte of GPR rt is stored in memory at the location specified by 
the effective address. The 16-bit signed offset is added to the contents of GPR base to form 
the effective address. 


Restrictions: 


None 


Operation: (128-bit bus)
vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
byte ← vAddr3..0 xor BigEndian^4
dataquad ← GPR[rt](127-8*byte)..0 || 0^(8*byte)
StoreMemory (uncached, BYTE, dataquad, pAddr, vAddr, DATA)
Exceptions: 


TLB Refill 
TLB Invalid 
TLB Modified 
Address Error 


Programming Notes: 


None 
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SD Store Doubleword SD 


31 26 25 21 20 16 15 0 
SD base rt offset
6 5 5 16
MIPS III
Format: SD rt, offset (base)
Purpose: To store a doubleword to memory.
Description: memory [base + offset] ← rt


The 64-bit doubleword in GPR rt is stored in memory at the location specified by the 
aligned effective address. The 16-bit signed offset is added to the contents of GPR base to 
form the effective address. 


Restrictions: 


The effective address must be naturally aligned. If any of the three least-significant bits of
the effective address are non-zero, an Address Error exception occurs.
Operation: (128-bit bus)
vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr2..0) ≠ 0^3 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian || 0^3))
byte ← vAddr3..0 xor (BigEndian || 0^3)
dataquad ← GPR[rt](127-8*byte)..0 || 0^(8*byte)
StoreMemory (uncached, DOUBLEWORD, dataquad, pAddr, vAddr, DATA)
Exceptions: 


TLB Refill 
TLB Invalid 
TLB Modified 
Address Error 


Programming Notes: 


None 
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SDL Store Doubleword Left SDL 


31 26 25 21 20 16 15 0
SDL base rt offset
6 5 5 16


MIPS III
Format: SDL rt, offset (base)


Purpose: To store the more-significant part of a doubleword to an unaligned memory
address.


Description: memory [base + offset] ← rt


Paired SDL and SDR instructions are used to store a doubleword from a register into 
eight consecutive bytes in memory starting at an arbitrary byte address. SDL stores the 
left (most-significant) bytes and SDR stores the right (least-significant) bytes. 


The 16-bit signed offset is added to the contents of GPR base to form the effective address 
of the most-significant byte of the contiguous doubleword in memory. It alters only the 
doubleword in memory which contains that byte. From one to eight bytes will be stored, 
depending on the starting byte specified. 


Conceptually, it starts at the most-significant byte of the register and copies it to the
specified byte in memory; then it copies bytes from register to memory until it reaches the
low-order byte of the doubleword in memory.


No address exceptions due to alignment are possible. 


[Figure: SDL examples (SDL $24,10 ($0) and SDL $24,1 ($0)), each showing the register and
the memory contents at addresses 0 and 8 after the instruction.]

Restrictions: 


None 


Operation: (128-bit bus)
vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..3 || 0^3
endif
byte ← 0 || (vAddr2..0 xor BigEndian^3)
if (vAddr3 xor BigEndian = 0) then
    dataquad ← 0^64 || 0^(56-8*byte) || GPR[rt]63..(56-8*byte)
else
    dataquad ← 0^(56-8*byte) || GPR[rt]63..(56-8*byte) || 0^64
endif
StoreMemory (uncached, byte, dataquad, pAddr, vAddr, DATA)


Given a doubleword in a register and a doubleword in memory, the operation of SDL is as 
follows: 
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SDL 
MSB 63                          0 LSB
Register        A  B  C  D  E  F  G  H

Little-endian   15 14 13 12 11 10 9  8  7  6  5  4  3  2  1  0
Memory

Little-endian byte ordering (BigEndianCPU = 0)

[Table: memory contents after instruction, Type, and Offset (LEM, BEM) for each value of
vAddr3..0]
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SDL 


MSB 63                          0 LSB
Register        A  B  C  D  E  F  G  H

Big-endian      0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
Memory
Little-endian   15 14 13 12 11 10 9  8  7  6  5  4  3  2  1  0

Big-endian byte ordering (BigEndianCPU = 1)

[Table: memory contents after instruction, Type, and Offset (LEM, BEM) for each value of
vAddr3..0]


LEM     Little-endian memory (BigEndianMem = 0)
BEM     Big-endian memory (BigEndianMem = 1)
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory
Exceptions: 
TLB Refill 
TLB Invalid 
TLB Modified 


Address Error 


Programming Notes: 


None 
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SDR Store Doubleword Right SDR 


31 26 25 21 20 16 15 0 
SDR 
6 5 5 16 
MIPS III 
Format: SDR rt, offset (base) 
Purpose: To store the less-significant part of a doubleword to an unaligned memory address. 
Description: memory [base + offset] < rt 


Paired SDL and SDR instructions are used to store a doubleword from a register into 
eight consecutive bytes in memory starting at an arbitrary byte address. SDL stores the 
left (most-significant) bytes and SDR stores the right (least-significant) bytes. 


The SDR instruction adds its sign-extended 16-bit offset to the contents of GPR base to
form an effective address which may specify an arbitrary byte. It alters only the
doubleword in memory which contains that byte. From one to eight bytes will be stored,
depending on the starting byte specified.


Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it
to the specified byte in memory; then it copies bytes from register to memory until it
reaches the high-order byte of the doubleword in memory. No address exceptions due to
alignment are possible.
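
The way an SDR and SDL pair covers eight unaligned bytes can be sketched in C. The model below assumes little-endian byte ordering and treats memory as a plain byte array. The function names `sdr_le`, `sdl_le` and `store_unaligned_dword_le` are hypothetical.

```c
#include <stdint.h>

/* SDR, little-endian: addr names the least-significant byte of the
   destination. Register bytes are copied, starting with the low-order
   byte, from addr up to the end of the aligned doubleword. */
static void sdr_le(uint8_t *mem, uint32_t addr, uint64_t reg)
{
    uint32_t base = addr & ~(uint32_t)7;
    unsigned byte = addr & 7;
    for (unsigned i = byte; i < 8; i++)
        mem[base + i] = (uint8_t)(reg >> (8 * (i - byte)));
}

/* SDL, little-endian: addr names the most-significant byte of the
   destination. The high-order register bytes are copied into the aligned
   doubleword from its first byte up to addr. */
static void sdl_le(uint8_t *mem, uint32_t addr, uint64_t reg)
{
    uint32_t base = addr & ~(uint32_t)7;
    unsigned byte = addr & 7;
    for (unsigned i = 0; i <= byte; i++)
        mem[base + i] = (uint8_t)(reg >> (8 * (7 - byte + i)));
}

/* Unaligned doubleword store: SDR rt, 0(addr) followed by SDL rt, 7(addr). */
static void store_unaligned_dword_le(uint8_t *mem, uint32_t addr, uint64_t reg)
{
    sdr_le(mem, addr, reg);
    sdl_le(mem, addr + 7, reg);
}
```

When the address is already aligned, both calls write the same eight bytes, so the pair is still correct, only redundant.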


[Figure: SDR examples for little-endian memory (SDR $24,3 ($0)) and big-endian memory
(SDR $24,5 ($0)), each showing the register and the memory contents at addresses 0 and 8
after the instruction.]

Restrictions: 


None 
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Operation: (128-bit bus) 
vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..3 || 0^3
endif
byte ← vAddr2..0 xor BigEndian^3
if (vAddr3 xor BigEndian = 0) then
    dataquad ← 0^64 || GPR[rt](63-8*byte)..0 || 0^(8*byte)
else
    dataquad ← GPR[rt](63-8*byte)..0 || 0^(8*byte) || 0^64
endif
StoreMemory (uncached, DOUBLEWORD-byte, dataquad, pAddr, vAddr, DATA)


Given a doubleword in a register and a doubleword in memory, the operation of SDR is as 
follows: 
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SDR 
MSB 63                          0 LSB
Register        A  B  C  D  E  F  G  H

Little-endian   15 14 13 12 11 10 9  8  7  6  5  4  3  2  1  0
Memory

Little-endian byte ordering (BigEndianCPU = 0)
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SDR 


MSB 63                          0 LSB
Register        A  B  C  D  E  F  G  H

Big-endian      0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
Memory
Little-endian   15 14 13 12 11 10 9  8  7  6  5  4  3  2  1  0

Big-endian byte ordering (BigEndianCPU = 1)
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LEM Littleendian memory (BigEndianMem =0) 
BEM     Big-endian memory (BigEndianMem = 1)
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory
Exceptions: 
TLB Refill 
TLB Invalid 
TLB Modified 


Address Error 


Programming Notes: 


None 
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SH Store Halfword SH 


31 26 25 21 20 16 15 0 
SH base rt offset
6 5 5 16
MIPS I
Format: SH rt, offset (base)
Purpose: To store a halfword to memory.
Description: memory [base + offset] ← rt


The least-significant 16-bit halfword of register rt is stored in memory at the location
specified by the aligned effective address. The 16-bit signed offset is added to the contents
of GPR base to form the effective address.


Restrictions: 


The effective address must be naturally aligned. If the least-significant bit of the address 
is non-zero, an Address Error exception occurs. 
Operation: (128-bit bus)
vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr0) ≠ 0 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian^3 || 0))
byte ← vAddr3..0 xor (BigEndian^3 || 0)
dataquad ← GPR[rt](127-8*byte)..0 || 0^(8*byte)
StoreMemory (uncached, HALFWORD, dataquad, pAddr, vAddr, DATA)
Exceptions: 


TLB Refill 
TLB Invalid 
TLB Modified 
Address Error 


Programming Notes: 


None
SLL Shift Word Left Logical SLL


31 26 25 21 20 16 15 11 10 6 5 0
SPECIAL 0 rt rd sa SLL
000000 00000 000000
6 5 5 5 5 6
MIPS I
Format: SLL rd, rt, sa
Purpose: To left shift a word by a fixed number of bits.
Description: rd ← rt << sa


The contents of the low-order 32-bit word of GPR rt are shifted left, inserting zeroes into 
the emptied bits; the word result is placed in GPR rd. The bit shift count is specified by sa. 
The result word is sign-extended. 
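
The shift-then-sign-extend behavior can be modeled in a few lines of C. In this sketch the function name `sll` and its C signature are assumptions for illustration. It shows in particular that a shift amount of zero truncates a 64-bit value to its low word and sign-extends it.

```c
#include <stdint.h>

/* SLL rd, rt, sa: shift the low 32-bit word of rt left by sa (0..31),
   then sign-extend the 32-bit result to 64 bits. */
static int64_t sll(uint64_t rt, unsigned sa)
{
    uint32_t temp = (uint32_t)rt << (sa & 31);
    /* Portable sign extension of bit 31 into bits 63..32. */
    return (int64_t)temp - ((temp & 0x80000000u) ? 0x100000000LL : 0);
}
```

For example, `sll(x, 0)` discards bits 63..32 of `x` and replaces them with copies of bit 31.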


Restrictions: 


None 


Operation: 

s ← sa
temp ← GPR[rt](31-s)..0 || 0^s
GPR[rd]63..0 ← sign_extend (temp31..0)
Exceptions: 


None 


Programming Notes: 


Unlike nearly all other word operations, the input operand does not have to be a properly
sign-extended word value to produce a valid sign-extended 32-bit result. The result word
is always sign-extended into a 64-bit destination register; this instruction with a zero shift
amount truncates a 64-bit value to 32 bits, sign-extends it, and stores it in the
destination register.
SLLV Shift Word Left Logical Variable SLLV


31 26 25 21 20 16 15 11 10 6 5 0
SPECIAL rs rt rd 0 SLLV
000000 00000 000100
6 5 5 5 5 6
MIPS I
Format: SLLV rd, rt, rs
Purpose: To left shift a word by a variable number of bits.
Description: rd ← rt << rs


The contents of the low-order 32-bit word of GPR rt are shifted left, inserting zeroes into
the emptied bits; the result word is placed in GPR rd. The bit shift count is specified by
the low-order five bits of GPR rs. The result word is sign-extended.


Restrictions: 
None 


Operation: 


s ← GPR[rs]4..0
temp ← GPR[rt](31-s)..0 || 0^s
GPR[rd]63..0 ← sign_extend (temp31..0)
Exceptions: 


None 


Programming Notes: 


None
SLT Set on Less Than SLT


31 26 25 21 20 16 15 11 10 6 5 0
SPECIAL rs rt rd 0 SLT
000000 00000 101010
6 5 5 5 5 6


MIPS I
Format: SLT rd, rs, rt
Purpose: To record the result of a less-than comparison.
Description: rd ← (rs < rt)


Compare the contents of GPR rs and GPR rt as signed integers and record the Boolean
result of the comparison in GPR rd. If GPR rs is less than GPR rt the result is 1 (true),
otherwise 0 (false).


The arithmetic comparison does not cause an Integer Overflow exception. 


Restrictions: 


None 


Operation: 


if GPR[rs]63..0 < GPR[rt]63..0 then
    GPR[rd]63..0 ← 0^(GPRLEN-1) || 1
else
    GPR[rd]63..0 ← 0^GPRLEN
endif


Exceptions: 


None 


Programming Notes: 


None 
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Format: SLTI rt, rs, immediate 
Purpose: To record the result of a less-than comparison with a constant. 
Description: rt ← (rs < immediate)


Compare the contents of GPR rs and the 16-bit signed immediate as signed integers and 
record the Boolean result of the comparison in GPR rt. If GPR rs is less than immediate 
the result is 1 (true), otherwise 0 (false). 

The arithmetic comparison does not cause an Integer Overflow exception. 


Restrictions: 


None 


Operation: 


if GPR[rs]63..0 < sign_extend (immediate) then
    GPR[rt]63..0 ← 0^(GPRLEN-1) || 1
else
    GPR[rt]63..0 ← 0^GPRLEN
endif


Exceptions: 


None 


Programming Notes: 


None 
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Format: SLTIU rt, rs, immediate 
Purpose: To record the result of an unsigned less-than comparison with a constant. 
Description: rt ← (rs < immediate)
Compare the contents of GPR rs and the sign-extended 16-bit immediate as unsigned
integers and record the Boolean result of the comparison in GPR rt. If GPR rs is less than
immediate the result is 1 (true), otherwise 0 (false).


Because the 16-bit immediate is sign-extended before comparison, the instruction is able
to represent the smallest or largest unsigned numbers. The representable values are at
the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of the
unsigned range.
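
The effect of sign-extending the immediate and then comparing as unsigned can be sketched in C. The function name `sltiu` and its C signature are assumptions made for illustration.

```c
#include <stdint.h>

/* SLTIU rt, rs, immediate: sign-extend the 16-bit immediate to 64 bits,
   then compare both operands as unsigned integers. */
static uint64_t sltiu(uint64_t rs, int16_t immediate)
{
    uint64_t imm = (uint64_t)(int64_t)immediate;
    return rs < imm ? 1 : 0;
}
```

A negative immediate therefore selects a comparison value at the top of the unsigned range. For instance, an immediate of -1 compares against the largest unsigned value, so the result is 1 for every `rs` except that value itself.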

The arithmetic comparison does not cause an Integer Overflow exception. 


Restrictions: 


None 


Operation: 
if (0 || GPR[rs]63..0) < (0 || sign_extend (immediate)) then
    GPR[rt]63..0 ← 0^(GPRLEN-1) || 1
else
    GPR[rt]63..0 ← 0^GPRLEN
endif


Exceptions: 


None 


Programming Notes: 


None 
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Format: SLTU rd, rs, rt 
Purpose: To record the result of an unsigned less-than comparison. 
Description: rd ← (rs < rt)


Compare the contents of GPR rs and GPR rt as unsigned integers and record the Boolean 
result of the comparison in GPR rd. If GPR rs is less than GPR rt the result is 1 (true), 
otherwise 0 (false). 

The arithmetic comparison does not cause an Integer Overflow exception. 


Restrictions: 


None 


Operation: 


if (0 || GPR[rs]63..0) < (0 || GPR[rt]63..0) then
    GPR[rd]63..0 ← 0^(GPRLEN-1) || 1
else
    GPR[rd]63..0 ← 0^GPRLEN
endif


Exceptions: 


None 


Programming Notes: 


None 
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Format: SRA rd, rt sa 
Purpose: To arithmetic right shift a word by a fixed number of bits. 
Description: rd ← rt >> sa (arithmetic)


The contents of the low-order 32-bit word of GPR rt are shifted right, duplicating the sign- 
bit (bit 31) in the emptied bits; the word result is placed in GPR rd. The bit shift count is 
specified by sa. The result word is sign-extended. 
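
A C model of the arithmetic word shift is sketched below. It avoids right-shifting a negative signed value, whose result is implementation-defined in C, by shifting an unsigned word and filling the vacated bits explicitly. The function names `sext32` and `sra` are hypothetical, and the input is assumed to be the low 32-bit word of a properly sign-extended register value.

```c
#include <stdint.h>

/* Sign-extend a 32-bit word to 64 bits. */
static int64_t sext32(uint32_t w)
{
    return (int64_t)w - ((w & 0x80000000u) ? 0x100000000LL : 0);
}

/* SRA rd, rt, sa: shift the word right by sa (0..31), copying bit 31 into
   the emptied bits, then sign-extend the 32-bit result. */
static int64_t sra(uint32_t word, unsigned sa)
{
    uint32_t temp;
    sa &= 31;
    temp = word >> sa;
    if ((word & 0x80000000u) && sa != 0)
        temp |= ~(0xFFFFFFFFu >> sa);   /* fill the top sa bits with ones */
    return sext32(temp);
}
```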


Restrictions: 


If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal) then the result of
the operation is undefined.


Operation:


if (NotWordValue (GPR[rt]63..0)) then UndefinedResult () endif
s ← sa
temp ← (GPR[rt]31)^s || GPR[rt]31..s
GPR[rd]63..0 ← sign_extend (temp31..0)
Exceptions: 


None 


Programming Notes: 


None 
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Format: SRAV rd, rt, rs 
Purpose: To arithmetic right shift a word by a variable number of bits. 
Description: rd ← rt >> rs (arithmetic)


The contents of the low-order 32-bit word of GPR rt are shifted right, duplicating the sign- 
bit (bit 31) in the emptied bits; the word result is placed in GPR rd. The bit shift count is 
specified by the low-order five bits of GPR rs. The result word is sign-extended. 


Restrictions: 


If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal) then the result
of the operation is undefined.


Operation: 


if (NotWordValue(GPR[rt]63..0)) then UndefinedResult() endif
s ← GPR[rs]4..0
temp ← (GPR[rt]31)^s || GPR[rt]31..s
GPR[rd]63..0 ← sign_extend(temp31..0)
Exceptions: 


None 


Programming Notes: 


None 
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Format: SRL rd, rt, sa 
Purpose: To logical right shift a word by a fixed number of bits. 
Description: rd ← rt >> sa (logical)


The contents of the low-order 32-bit word of GPR rt are shifted right, inserting zeros into 
the emptied bits; the word result is placed in GPR rd. The bit shift count is specified by sa. 
The result word is sign-extended. 


Restrictions: 


If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal) then the result
of the operation is undefined.


Operation: 
if (NotWordValue(GPR[rt]63..0)) then UndefinedResult() endif
s ← sa
temp ← 0^s || GPR[rt]31..s
GPR[rd]63..0 ← sign_extend(temp31..0)
Exceptions: 
None 


Programming Notes: 


None 
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Format: SRLV rd, rt, rs 
Purpose: To logical right shift a word by a variable number of bits. 
Description: rd ← rt >> rs (logical)


The contents of the low-order 32-bit word of GPR rt are shifted right, inserting zeros into
the emptied bits; the word result is placed in GPR rd. The bit shift count is specified by
the low-order five bits of GPR rs. The result word is sign-extended.


Restrictions: 


If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal) then the result 
of the operation is undefined. 


Operation: 


if (NotWordValue(GPR[rt]63..0)) then UndefinedResult() endif
s ← GPR[rs]4..0
temp ← 0^s || GPR[rt]31..s
GPR[rd]63..0 ← sign_extend(temp31..0)
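
A short Python sketch of the variable logical shift is given below. The name `srlv` is hypothetical and the model assumes a valid sign-extended word in rt. It highlights two details of the description: only the low-order five bits of rs are used, and the 32-bit result is still sign-extended, which matters when the shift count is zero.

```python
def srlv(rt: int, rs: int) -> int:
    """Logical right shift of the low 32-bit word by rs[4..0]; result sign-extended."""
    s = rs & 0x1F                      # only the low-order five bits are used
    temp = (rt & 0xFFFFFFFF) >> s      # zeros enter from the left
    if temp & 0x80000000:              # bit 31 can only remain set when s == 0
        temp |= 0xFFFFFFFF00000000
    return temp
```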
Exceptions: 


None 


Programming Notes: 


None 
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Format: SUB rd, rs, rt 
Purpose: To subtract 32-bit integers. If overflow occurs, then trap. 
Description: rd ← rs - rt


The 32-bit word value in GPR rt is subtracted from the 32-bit value in GPR rs to produce a
32-bit result. If the subtraction results in 32-bit 2’s complement arithmetic overflow then 
the destination register is not modified and an Integer Overflow exception occurs. If it 
does not overflow, the 32-bit result is placed into GPR rd. 


Restrictions: 


If either GPR rt or GPR rs do not contain sign-extended 32-bit values (bits 63..31 equal), 
then the result of the operation is undefined. 


Operation: 


if (NotWordValue(GPR[rs]63..0) or NotWordValue(GPR[rt]63..0)) then UndefinedResult() endif
temp ← GPR[rs]63..0 - GPR[rt]63..0
if (32_bit_arithmetic_overflow) then
    SignalException(IntegerOverflow)
else
    GPR[rd]63..0 ← sign_extend(temp31..0)
endif
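
The overflow rule can be illustrated with a small Python model. This is a sketch under assumptions: `IntegerOverflow`, `to_signed32` and `sub` are hypothetical names, and raising a Python exception merely stands in for the Integer Overflow exception.

```python
MASK64 = (1 << 64) - 1

class IntegerOverflow(Exception):
    """Hypothetical stand-in for the Integer Overflow exception."""

def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's complement integer."""
    word = value & 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word

def sub(rs: int, rt: int) -> int:
    """Return the sign-extended 64-bit result, or raise if the 32-bit result overflows."""
    diff = to_signed32(rs) - to_signed32(rt)
    if not -(1 << 31) <= diff <= (1 << 31) - 1:
        raise IntegerOverflow("32-bit two's complement overflow")
    return diff & MASK64
```

Because the exception is raised before any value is returned, the model mirrors the rule that the destination register is not modified on overflow.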


Exceptions: 


Integer Overflow 


Programming Notes: 


SUBU performs the same arithmetic operation but does not trap on overflow.
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Format: SUBU rd, rs, rt


Purpose: To subtract 32-bit integers. 


MIPS I


Description: rd ← rs - rt


The 32-bit word value in GPR rt is subtracted from the 32-bit value in GPR rs and the
32-bit arithmetic result is placed into GPR rd.
No integer overflow exception occurs under any circumstances.

Restrictions: 
If either GPR rt or GPR rs do not contain sign-extended 32-bit values (bits 63..31 equal), 
then the result of the operation is undefined. 


Operation: 


if (NotWordValue(GPR[rs]63..0) or NotWordValue(GPR[rt]63..0)) then UndefinedResult() endif
temp ← GPR[rs]63..0 - GPR[rt]63..0
GPR[rd]63..0 ← sign_extend(temp31..0)

Exceptions: 


None 


Programming Notes: 


The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit 
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is 
not signed, such as address arithmetic, or integer arithmetic environments that ignore 
overflow, such as C language arithmetic. 
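
The "32-bit modulo arithmetic" mentioned above can be sketched in Python as follows. The name `subu` is hypothetical and the function is an illustration only.

```python
def subu(rs: int, rt: int) -> int:
    """32-bit modulo subtraction; the result is sign-extended to 64 bits, never trapping."""
    temp = (rs - rt) & 0xFFFFFFFF          # wrap around modulo 2**32
    if temp & 0x80000000:                  # sign-extend bit 31 into bits 63..32
        return temp | 0xFFFFFFFF00000000
    return temp
```

Where a trapping subtraction would raise Integer Overflow (for example, the most negative word minus one), this model simply wraps around.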
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Format: SW rt, offset (base) 
Purpose: To store a word to memory. 
Description: memory[base + offset] ← rt


The least-significant 32-bit word of register rt is stored in memory at the location specified 
by the aligned effective address. The 16-bit signed offset is added to the contents of GPR 
base to form the effective address. 


Restrictions: 


The effective address must be naturally aligned. If either of the two least-significant bits
of the address is non-zero, an Address Error exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]31..0
if (vAddr1..0) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation(vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian^2 || 0^2))
byte ← vAddr3..0 xor (BigEndian^2 || 0^2)
dataquad ← GPR[rt](127-8*byte)..0 || 0^(8*byte)
StoreMemory(uncached, WORD, dataquad, pAddr, vAddr, DATA)
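
The address calculation and alignment check in the first two steps can be sketched in Python. This is a minimal model, not the implementation: `AddressError` and `sw_effective_address` are hypothetical names, and address translation and the bus data lanes are deliberately left out.

```python
class AddressError(Exception):
    """Hypothetical stand-in for the Address Error exception."""

def sw_effective_address(base: int, offset16: int) -> int:
    """Add the sign-extended 16-bit offset to base[31..0] and check word alignment."""
    offset = offset16 - 0x10000 if offset16 & 0x8000 else offset16
    vaddr = ((base & 0xFFFFFFFF) + offset) & 0xFFFFFFFF
    if vaddr & 0b11:                      # either of the two low bits is non-zero
        raise AddressError(hex(vaddr))
    return vaddr
```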
Exceptions: 


TLB Refill 
TLB Invalid 
TLB Modified 
Address Error 


Programming Notes: 


None 
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Format: SWL rt, offset (base) 
Purpose: To store the more-significant part of a word to an unaligned memory address. 
Description: memory[base + offset] ← rt

Paired SWL and SWR instructions are used to store a word from a register into four
consecutive bytes in memory starting at an arbitrary byte address. SWL stores the left
(most-significant) bytes and SWR stores the right (least-significant) bytes.


The SWL instruction adds its sign-extended 16-bit offset to the contents of GPR base to 
form an effective address which may specify an arbitrary byte. It alters only the word in 
memory which contains that byte. From one to four bytes will be stored, depending on the 
starting byte specified. 


Conceptually, it starts at the most-significant byte of the register and copies it to the 
specified byte in memory; then it copies bytes from register to memory until it reaches the 
low-order byte of the word in memory. 


No address exceptions due to alignment are possible. 


[Figure: SWL store examples showing register $24 and the memory words at address 0 and
address 4, before and after the store. Little-endian memory: SWL $24,6($0). Big-endian
memory: SWL $24,1($0).]
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Restrictions: 


None 


Operation: 


vAddr ← sign_extend(offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation(vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..2 || 0^2
endif
byte ← vAddr1..0 xor BigEndian^2
if (vAddr3..2 xor BigEndian^2) = 00₂ then
    dataquad ← 0^96 || 0^(24-8*byte) || GPR[rt]31..(24-8*byte)
elseif (vAddr3..2 xor BigEndian^2) = 01₂ then
    dataquad ← 0^64 || 0^(24-8*byte) || GPR[rt]31..(24-8*byte) || 0^32
elseif (vAddr3..2 xor BigEndian^2) = 10₂ then
    dataquad ← 0^32 || 0^(24-8*byte) || GPR[rt]31..(24-8*byte) || 0^64
elseif (vAddr3..2 xor BigEndian^2) = 11₂ then
    dataquad ← 0^(24-8*byte) || GPR[rt]31..(24-8*byte) || 0^96
endif
StoreMemory(uncached, byte, dataquad, pAddr, vAddr, DATA)


Given a doubleword in a register and a doubleword in memory, the operation of SWL is as 
follows: 
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LEM     Little-endian memory (BigEndianMem = 0)
BEM     Big-endian memory (BigEndianMem = 1)
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory
Exceptions: 
TLB Refill 
TLB Invalid 
TLB Modified 


Address Error 


Programming Notes: 


None 
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MIPS I
Format: SWR rt, offset (base) 
Purpose: To store the less-significant part of a word to an unaligned memory address. 


Description: memory[base + offset] ← rt


Paired SWL and SWR instructions are used to store a word from a register into four 
consecutive bytes in memory starting at an arbitrary byte address. SWL stores the left 
(most-significant) bytes and SWR stores the right (least-significant) bytes. 


The SWR instruction adds its sign-extended 16-bit offset to the contents of GPR base to 
form an effective address which may specify an arbitrary byte. It alters only the word in 
memory which contains that byte. From one to four bytes will be stored, depending on the 
starting byte specified. 


Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it 
to the specified byte in memory; then copies bytes from register to memory until it reaches 
the high-order byte of the word in memory. 


No address exceptions due to alignment are possible. 
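
The byte movement performed by SWL and SWR can be modelled on a byte array. The following Python sketch covers the little-endian case only; the names `swl_le` and `swr_le` and the use of a `bytearray` as memory are assumptions made for illustration.

```python
def swl_le(mem: bytearray, addr: int, rt: int) -> None:
    """SWL, little-endian: start at the register's most-significant byte and walk
    down to the low-order byte of the memory word containing addr."""
    word = rt & 0xFFFFFFFF
    for i in range((addr & 3) + 1):
        mem[addr - i] = (word >> (8 * (3 - i))) & 0xFF

def swr_le(mem: bytearray, addr: int, rt: int) -> None:
    """SWR, little-endian: start at the register's least-significant byte and walk
    up to the high-order byte of the memory word containing addr."""
    word = rt & 0xFFFFFFFF
    for i in range(4 - (addr & 3)):
        mem[addr + i] = (word >> (8 * i)) & 0xFF

# Storing a word at unaligned address 1 takes one SWR and one SWL three bytes higher.
memory = bytearray(8)
swr_le(memory, 1, 0x11223344)
swl_le(memory, 4, 0x11223344)
```

Each call touches only the aligned word that contains its address, which is why the pair never raises an alignment exception.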


[Figure: SWR store example showing the register and little-endian memory contents.]
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Restrictions: 


None 


Operation: 


vAddr ← sign_extend(offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation(vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..2 || 0^2
endif
byte ← vAddr1..0 xor BigEndian^2
if (vAddr3..2 xor BigEndian^2) = 00₂ then
    dataquad ← 0^96 || GPR[rt](31-8*byte)..0 || 0^(8*byte)
else if (vAddr3..2 xor BigEndian^2) = 01₂ then
    dataquad ← 0^64 || GPR[rt](31-8*byte)..0 || 0^(8*byte) || 0^32
else if (vAddr3..2 xor BigEndian^2) = 10₂ then
    dataquad ← 0^32 || GPR[rt](31-8*byte)..0 || 0^(8*byte) || 0^64
else if (vAddr3..2 xor BigEndian^2) = 11₂ then
    dataquad ← GPR[rt](31-8*byte)..0 || 0^(8*byte) || 0^96
endif
StoreMemory(uncached, WORD-byte, dataquad, pAddr, vAddr, DATA)


Given a doubleword in a register and a doubleword in memory, the operation of SWR is as 
follows: 
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LEM     Little-endian memory (BigEndianMem = 0)
BEM     Big-endian memory (BigEndianMem = 1)
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory
Exceptions: 
TLB Refill 
TLB Invalid 
TLB Modified 


Address Error 


Programming Notes: 


None 
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31 26 25 11 10 6 5 0
SPECIAL 0 stype SYNC
6 15 5 6
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Format: SYNC (stype = 0xxxx)
SYNC.L (stype = 0xxxx)
SYNC.P (stype = 1xxxx)
Purpose: To perform either a memory barrier operation or a pipeline barrier operation. 
Description: 


This instruction interlocks the pipeline until either all pending loads and stores have
completed (memory barrier) or all earlier issued instructions have completed (pipeline
barrier).


In the case of the SYNC or SYNC.L instruction (memory barrier), all pending loads and
stores are retired. Loads are retired when the destination register is written. Stores are
retired when the stored data (in store buffers or write buffers) is either stored in the data
cache, or sent on the processor bus and SYSDACK* has been asserted. Any uncached
accelerated data-gathering operation is terminated, and the uncached accelerated buffer is
invalidated. All bus read processes due to load/store/pref/cache instructions are completed.
All pending bus write processes in the write back buffer are completed.


In the case of the SYNC.P instruction (pipeline barrier), all instructions prior to the
barrier are completed before the instructions following the barrier operation are fetched.
Note that the barrier operation does not wait for instructions that were issued prior to
the barrier operation but have not yet retired (e.g., multiply, divide, or multicycle COP1
operations, or a pending load).
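
The selection between the two barrier types depends only on the most-significant bit of stype. The Python sketch below illustrates this; it assumes that stype occupies bits 10..6 of the instruction word (the 5-bit field in the encoding diagram), and the function name `sync_kind` is hypothetical.

```python
def sync_kind(instr: int) -> str:
    """Classify a SYNC instruction word by the top bit of its 5-bit stype field."""
    stype = (instr >> 6) & 0x1F        # assumed field position: bits 10..6
    return "pipeline" if stype & 0x10 else "memory"
```

Here "memory" corresponds to SYNC and SYNC.L (stype = 0xxxx) and "pipeline" to SYNC.P (stype = 1xxxx).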

Operation: 
SyncOperation (stype) 

Exceptions: 


None 
Programming Notes: 


The SYNC instruction (SYNC.P or SYNC.L) is not allowed in the branch delay slot of 
instructions which have branch delay slots. 
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SYSCALL System Call SYSCALL 


31 26 25 6 5 0
SPECIAL SYSCALL
6 20 6
MIPS I
Format: SYSCALL 
Purpose: To cause a System Call exception. 
Description: 


A system call exception occurs, immediately and unconditionally transferring control to 
the exception handler. 


The code field is available for use as software parameters, but is retrieved by the exception 
handler only by loading the contents of the memory word containing the instruction. 
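
Once the handler has loaded the instruction word, extracting the code field is a shift and a mask. The Python sketch below assumes the 20-bit field lies in bits 25..6, as the field widths in the encoding diagram suggest; the name `syscall_code` is hypothetical.

```python
def syscall_code(instr: int) -> int:
    """Return the 20-bit code field (assumed to be bits 25..6) of a SYSCALL word."""
    return (instr >> 6) & 0xFFFFF
```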


Restrictions: 
None 


Operation: 
SignalException(SystemCall)
Exceptions: 


System Call 


Programming Notes: 


None 




TEQ Trap if Equal TEQ 
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Format: TEQ rs, rt 
Purpose: To compare GPRs and do a conditional Trap. 
Description: if (rs = rt) then Trap 


Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is equal to GPR
rt then take a Trap exception.


The contents of the code field are ignored by hardware and may be used to encode 
information for system software. To retrieve the information, system software must load 
the instruction word from memory. 


Restrictions: 
None 


Operation: 


if GPR[rs]63..0 = GPR[rt]63..0 then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TEQI Trap if Equal Immediate TEQI 


31 26 25 21 20 16 15 0 
REGIMM TEQI 
6 5 5 16 
MIPS II
Format: TEQI rs, immediate
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs = immediate) then Trap


Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is equal to immediate then take a Trap exception.


Restrictions: 
None 


Operation: 


if GPR[rs]63..0 = sign_extend(immediate) then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TGE Trap if Greater or Equal TGE
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SPECIAL TGE 
6 5 5 10 6 
MIPS II
Format: TGE rs, rt
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs ≥ rt) then Trap


Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is greater than 
or equal to GPR rt then take a Trap exception. 


The contents of the code field are ignored by hardware and may be used to encode 
information for system software. To retrieve the information, system software must load 
the instruction word from memory. 


Restrictions: 
None 


Operation: 


if GPR[rs]63..0 ≥ GPR[rt]63..0 then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TGEI Trap if Greater or Equal Immediate TGEI


31 26 25 21 20 16 15 0
REGIMM TGEI
6 5 5 16
MIPS II
Format: TGEI rs, immediate
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs ≥ immediate) then Trap


Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is greater than or equal to immediate then take a Trap exception.


Restrictions: 
None 


Operation: 


if GPR[rs]63..0 ≥ sign_extend(immediate) then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TGEIU Trap if Greater or Equal Immediate Unsigned TGEIU 


31 26 25 21 20 16 15 0 
000001 01001 
6 5 5 16 
MIPS II
Format: TGEIU rs, immediate
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs ≥ immediate) then Trap


Compare the contents of GPR rs and the 16-bit sign-extended immediate as unsigned
integers; if GPR rs is greater than or equal to immediate then take a Trap exception.


Because the 16-bit immediate is sign-extended before comparison, the instruction is able
to represent the smallest or largest unsigned numbers. The representable values are at 
the minimum [0,32767] or maximum [max_unsigned-32767, max_unsigned] end of the 
unsigned range. 
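
The two representable ranges follow directly from sign-extending the immediate and then reading the result as unsigned. A minimal Python sketch, assuming 64-bit GPRs and using the hypothetical names `unsigned_immediate` and `tgeiu_traps`:

```python
MASK64 = (1 << 64) - 1   # max_unsigned for an assumed 64-bit GPR

def unsigned_immediate(imm16: int) -> int:
    """Sign-extend a 16-bit immediate, then reinterpret it as an unsigned 64-bit value."""
    signed = imm16 - 0x10000 if imm16 & 0x8000 else imm16
    return signed & MASK64

def tgeiu_traps(rs: int, imm16: int) -> bool:
    """True if the unsigned comparison rs >= immediate would take the Trap."""
    return (rs & MASK64) >= unsigned_immediate(imm16)
```

Immediates 0x0000 to 0x7FFF map to the minimum end of the unsigned range, and 0x8000 to 0xFFFF map to the maximum end.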


Restrictions: 
None 


Operation: 


if (0 || GPR[rs]63..0) ≥ (0 || sign_extend(immediate)) then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TGEU Trap if Greater or Equal Unsigned TGEU


31 26 25 21 20 16 15 6 5 0 
SPECIAL TGEU 
6 5 5 10 6 
MIPS II
Format: TGEU rs, rt
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs ≥ rt) then Trap


Compare the contents of GPR rs and GPR rt as unsigned integers; if GPR rs is greater
than or equal to GPR rt then take a Trap exception.


The contents of the code field are ignored by hardware and may be used to encode 
information for system software. To retrieve the information, system software must load 
the instruction word from memory. 


Restrictions: 
None 


Operation: 


if (0 || GPR[rs]63..0) ≥ (0 || GPR[rt]63..0) then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TLT Trap if Less Than TLT


31 26 25 21 20 16 15 6 5 0 
SPECIAL TLT 
6 5 5 10 6 
MIPS II
Format: TLT rs, rt
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs < rt) then Trap


Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is less than
GPR rt then take a Trap exception.


The contents of the code field are ignored by hardware and may be used to encode 
information for system software. To retrieve the information, system software must load 
the instruction word from memory. 


Restrictions: 
None 


Operation: 


if GPR[rs]63..0 < GPR[rt]63..0 then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TLTI Trap if Less Than Immediate TLTI 


31 26 25 21 20 16 15 0 
REGIMM TLTI
6 5 5 16
MIPS II
Format: TLTI rs, immediate
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs < immediate) then Trap


Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is less than immediate then take a Trap exception.


Restrictions: 
None 


Operation: 


if GPR[rs]63..0 < sign_extend(immediate) then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 




TLTIU Trap if Less Than Immediate Unsigned TLTIU


31 26 25 21 20 16 15 0
000001 01011
6 5 5 16
MIPS II
Format: TLTIU rs, immediate
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs < immediate) then Trap


Compare the contents of GPR rs and the 16-bit sign-extended immediate as unsigned
integers; if GPR rs is less than immediate then take a Trap exception.


Because the 16-bit immediate is sign-extended before comparison, the instruction is able
to represent the smallest or largest unsigned numbers. The representable values are at 
the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of the 
unsigned range. 


Restrictions: 
None 


Operation: 


if (0 || GPR[rs]63..0) < (0 || sign_extend(immediate)) then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TLTU Trap if Less Than Unsigned TLTU 


31 26 25 21 20 16 15 6 5 0 
SPECIAL TLTU 
6 5 5 10 6 
MIPS II
Format: TLTU rs, rt
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs < rt) then Trap


Compare the contents of GPR rs and GPR rt as unsigned integers; if GPR rs is less than 
GPR rt then take a Trap exception. 
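
The difference between the signed and unsigned trap comparisons shows up only when bit 63 of an operand is set. The following Python sketch contrasts the two; the function names are hypothetical and 64-bit GPRs are assumed.

```python
MASK64 = (1 << 64) - 1

def to_signed64(value: int) -> int:
    """Interpret a 64-bit pattern as a two's complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def tlt_traps(rs: int, rt: int) -> bool:
    """Signed comparison, as used by TLT."""
    return to_signed64(rs) < to_signed64(rt)

def tltu_traps(rs: int, rt: int) -> bool:
    """Unsigned comparison, as used by TLTU."""
    return (rs & MASK64) < (rt & MASK64)
```

With rs holding all ones and rt holding 1, the signed form sees -1 < 1 and traps, while the unsigned form sees the largest possible value and does not.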


The contents of the code field are ignored by hardware and may be used to encode 
information for system software. To retrieve the information, system software must load 
the instruction word from memory. 


Restrictions: 
None 


Operation: 


if (0 || GPR[rs]63..0) < (0 || GPR[rt]63..0) then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TNE Trap if Not Equal TNE 


31 26 25 21 20 16 15 6 5 0 
SPECIAL TNE 
6 5 5 10 6 
MIPS II
Format: TNE rs, rt
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs ≠ rt) then Trap


Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is not equal to 
GPR rt then take a Trap exception. 


The contents of the code field are ignored by hardware and may be used to encode 
information for system software. To retrieve the information, system software must load 
the instruction word from memory. 


Restrictions: 
None 


Operation: 


if GPR[rs]63..0 ≠ GPR[rt]63..0 then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 
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TNEI Trap if Not Equal Immediate TNEI


31 26 25 21 20 16 15 0 
REGIMM TNEI
6 5 5 16
MIPS II
Format: TNEI rs, immediate
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs ≠ immediate) then Trap


Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is not equal to immediate then take a Trap exception.


Restrictions:
None 


Operation: 


if GPR[rs]63..0 ≠ sign_extend(immediate) then
    SignalException(Trap)
endif


Exceptions: 


Trap 


Programming Notes: 


None 




XOR Exclusive OR XOR


31 26 25 21 20 16 15 11 10 6 5 0
SPECIAL rs rt rd 0 XOR
000000 00000 100110
6 5 5 5 5 6
MIPS I
Format: XOR rd, rs, rt
Purpose: To do a bitwise logical EXCLUSIVE OR.
Description: rd ← rs XOR rt


Combine the contents of GPR rs and GPR rt in a bitwise logical exclusive OR operation 
and place the result into GPR rd. 


Restrictions: 
None 


Operation: 
GPR[rd]63..0 ← GPR[rs]63..0 xor GPR[rt]63..0
Exceptions: 


None 


Programming Notes: 


None 




XORI Exclusive OR Immediate XORI 


31 26 25 21 20 16 15 0
XORI
6 5 5 16
MIPS I
Format: XORI rt, rs, immediate
Purpose: To do a bitwise logical EXCLUSIVE OR with a constant.
Description: rt ← rs XOR immediate


Combine the contents of GPR rs and the 16-bit zero-extended immediate in a bitwise 
logical exclusive OR operation and place the result into GPR rt. 
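
Because the immediate is zero-extended rather than sign-extended, only the low-order 16 bits of the register can change. A minimal Python sketch of this, using the hypothetical name `xori` and assuming 64-bit GPRs:

```python
MASK64 = (1 << 64) - 1

def xori(rs: int, imm16: int) -> int:
    """XOR a 64-bit register value with a zero-extended 16-bit immediate."""
    return (rs & MASK64) ^ (imm16 & 0xFFFF)
```

Even an immediate of 0xFFFF leaves bits 63..16 of the source value untouched.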


Restrictions: 
None 


Operation: 
GPR[rt]63..0 ← GPR[rs]63..0 xor zero_extend(immediate)
Exceptions: 


None 


Programming Notes: 


None 
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A.5 CPU Instruction Encoding 


The following table shows the OpCode encoding of CPU instructions for the MIPS IV
architecture. This architecture level includes all MIPS I, MIPS II, MIPS III and some
MIPS IV instructions. Even though the OpCodes for MTSAB, MTSAH, MFSA, MTSA, LQ,
and SQ are shown in this OpCode table, these instructions are described in Appendix B
since they are C790-specific instructions.


Coprocessor 0 (COP0 - System Control Processor), Coprocessor 1 (COP1 - Floating-point
Processor) and C790-specific instructions are described in separate sections.


[Table: Instructions encoded by the OpCode field (bits 31..26). Rows are indexed by
bits 31..29 and columns by bits 28..26, each ranging over 000..111.]

[Table: Instructions encoded by the function field (bits 5..0) when OpCode field =
SPECIAL. Rows are indexed by bits 5..3 and columns by bits 2..0, each ranging over
000..111.]
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31 26 20 16 0 
[Table: Instructions encoded by the rt field (bits 20..16) when OpCode field = REGIMM.
Columns are indexed by bits 18..16.]


This OpCode is reserved for future use. An attempt to execute it causes a 
Reserved Instruction exception. 


This OpCode is reserved for one of the following instructions which are 
currently not supported: DMULT, DMULTU, DDIV, DDIVU, LL, LLD, SC, 
SCD, LWC2, SWC2. An attempt to execute it causes a Reserved Instruction 
exception. 


This OpCode indicates an instruction class. The instruction word must be 
further decoded by examining additional tables that show the values for 
another instruction field. 


This OpCode indicates C790 specific instructions. It is included in the table 
because it uses a primary OpCode in the instruction encoding map. 


This OpCode is a coprocessor operation, not a CPU operation. If the 
processor state does not allow access to the specified coprocessor, the 
instruction causes a Coprocessor Unusable exception. It is included in the 
table because it uses a primary OpCode in the instruction encoding map. 


This OpCode indicates the class of Coprocessor 0 (System Control Processor)
instructions. If the processor state does not allow access to coprocessor 0,
the instruction causes a Coprocessor Unusable exception. Further encoding
information for this instruction class is in the COP0 Instruction Encoding
tables.


This OpCode indicates the class of Coprocessor 1 (Floating-Point Processor)
instructions. If the processor state does not allow access to coprocessor 1,
the instruction causes a Coprocessor Unusable exception. Further encoding
information for this instruction class is in the COP1 Instruction Encoding
tables.
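
The fields used to index the encoding tables in this section can be extracted with shifts and masks. The Python sketch below is an illustration only; the function name `decode_fields` and the dictionary keys are hypothetical.

```python
def decode_fields(instr: int) -> dict:
    """Split a 32-bit instruction word into the fields used by the encoding tables."""
    opcode = (instr >> 26) & 0x3F          # OpCode field, bits 31..26
    function = instr & 0x3F                # function field, bits 5..0 (SPECIAL class)
    return {
        "opcode": opcode,
        "opcode_row": opcode >> 3,         # bits 31..29
        "opcode_column": opcode & 0x7,     # bits 28..26
        "function": function,
        "function_row": function >> 3,     # bits 5..3
        "function_column": function & 0x7, # bits 2..0
        "rt": (instr >> 16) & 0x1F,        # rt field, bits 20..16 (REGIMM class)
    }
```

For example, a word whose OpCode is 000001 and whose rt field is 01001 corresponds to the TGEIU encoding shown earlier in this appendix.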
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B. C790-Specific Instruction Set Details 


This appendix provides a detailed description of the operation of each C790-specific 
instruction. The C790’s instruction set is extended from the original MIPS ISA in order to 
support embedded applications. There are three classes of C790-specific instructions: 


• Three-operand Multiply and Multiply-Add instructions
• Multiply and Multiply-Add instructions for pipeline 1
• Multimedia instructions
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B.1 Conventions Used in This Chapter 
The HI and LO registers are 128 bits wide. Some instructions operate on either the lower
or the upper doublewords of these registers, and there are also instructions which operate
on the complete registers.
The following terminology is used for these registers.


• Strictly speaking, a reference to the least-significant doubleword of the HI and LO
registers should use the names HI0 and LO0. However, to be consistent with
existing MIPS terminology, these registers are just called HI and LO.


• Reference to the upper doublewords of the HI and LO registers is made by using
the names HI1 and LO1.


• Occasionally, based on context, the complete 128-bit registers are referred to as HI
and LO.


• Any portion of these registers can use the names HI and LO with the appropriate
bit width specifications. Thus HI1 can be referred to as HI127..64 and LO1 can be
referred to as LO127..64, etc.
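
The naming convention amounts to selecting bit ranges of a 128-bit value. As an illustration, the Python sketch below models this with a hypothetical helper `bits`; the register value used is arbitrary.

```python
def bits(value: int, high: int, low: int) -> int:
    """Return value[high..low] as an unsigned integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)

HI = (0x1111111111111111 << 64) | 0x2222222222222222   # arbitrary 128-bit example value
HI1 = bits(HI, 127, 64)   # upper doubleword, i.e. HI127..64
HI0 = bits(HI, 63, 0)     # lower doubleword, usually just called HI
```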


B.1.1 Instruction Description Notation and Functions 
The Operation sections of the instruction descriptions describe the operation performed by
each instruction using a high-level language notation, or pseudocode. Symbols, functions,
and structures used in the Operation sections are described here.


B.1.2 Pseudocode Language Statement Execution 


Each of the high-level language statements in an operation description is executed in 
sequential order (as modified by conditional and loop constructs). 


B.1.3 Pseudocode Symbols 


Special symbols used in the notation are described in Appendix A. 


B.2 Definitions for Pseudocode Functions Used in Operation 
Descriptions 
A variety of functions are used in the pseudocode descriptions to make the pseudocode
more readable and also to abstract implementation-specific behavior. These functions are
defined in Appendix A.
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B.3 Summary of C790-Specific Instructions 


B.3.1 Multiply and Multiply-Add Instructions 


• Three-Operand Multiply and Multiply-Add (4 instructions)

MADD      Multiply/Add
MADDU     Multiply/Add Unsigned
MULT      Multiply (3-operand)
MULTU     Multiply Unsigned (3-operand)


• Multiply Instructions for Pipeline 1 (10 instructions)

MULT1     Multiply Pipeline 1
MULTU1    Multiply Unsigned Pipeline 1
DIV1      Divide Pipeline 1
DIVU1     Divide Unsigned Pipeline 1
MADD1     Multiply-Add Pipeline 1
MADDU1    Multiply-Add Unsigned Pipeline 1
MFHI1     Move From HI1 Register
MFLO1     Move From LO1 Register
MTHI1     Move To HI1 Register
MTLO1     Move To LO1 Register


B.3.2 Multimedia Instructions 


• Arithmetic (19 instructions)

PADDB     Parallel Add Byte
PSUBB     Parallel Subtract Byte
PADDH     Parallel Add Halfword
PSUBH     Parallel Subtract Halfword
PADDW     Parallel Add Word
PSUBW     Parallel Subtract Word
PADSBH    Parallel Add/Subtract Halfword
PADDSB    Parallel Add with Signed Saturation Byte
PSUBSB    Parallel Subtract with Signed Saturation Byte
PADDSH    Parallel Add with Signed Saturation Halfword
PSUBSH    Parallel Subtract with Signed Saturation Halfword
PADDSW    Parallel Add with Signed Saturation Word
PSUBSW    Parallel Subtract with Signed Saturation Word
PADDUB    Parallel Add with Unsigned Saturation Byte
PSUBUB    Parallel Subtract with Unsigned Saturation Byte
PADDUH    Parallel Add with Unsigned Saturation Halfword
PSUBUH    Parallel Subtract with Unsigned Saturation Halfword
PADDUW    Parallel Add with Unsigned Saturation Word
PSUBUW    Parallel Subtract with Unsigned Saturation Word
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• Min/Max (4 instructions)

PMAXH     Parallel Maximum Halfword
PMINH     Parallel Minimum Halfword
PMAXW     Parallel Maximum Word
PMINW     Parallel Minimum Word


• Absolute (2 instructions)

PABSH     Parallel Absolute Halfword
PABSW     Parallel Absolute Word


• Logical (4 instructions)

PAND      Parallel AND
POR       Parallel OR
PXOR      Parallel XOR
PNOR      Parallel NOR


• Shift (9 instructions)

PSLLH     Parallel Shift Left Logical Halfword
PSRLH     Parallel Shift Right Logical Halfword
PSRAH     Parallel Shift Right Arithmetic Halfword
PSLLW     Parallel Shift Left Logical Word
PSRLW     Parallel Shift Right Logical Word
PSRAW     Parallel Shift Right Arithmetic Word
PSLLVW    Parallel Shift Left Logical Variable Word
PSRLVW    Parallel Shift Right Logical Variable Word
PSRAVW    Parallel Shift Right Arithmetic Variable Word


• Compare (6 instructions)

PCGTB     Parallel Compare for Greater Than Byte
PCEQB     Parallel Compare for Equal Byte
PCGTH     Parallel Compare for Greater Than Halfword
PCEQH     Parallel Compare for Equal Halfword
PCGTW     Parallel Compare for Greater Than Word
PCEQW     Parallel Compare for Equal Word


• LZC (1 instruction)

PLZCW     Parallel Leading Zero or One Count Word


• Quadword Load and Store (2 instructions)

LQ        Load Quadword
SQ        Store Quadword




• Multiply and Divide (19 instructions)

PMULTW    Parallel Multiply Word
PMULTUW   Parallel Multiply Unsigned Word
PDIVW     Parallel Divide Word
PDIVUW    Parallel Divide Unsigned Word
PMADDW    Parallel Multiply-Add Word
PMADDUW   Parallel Multiply-Add Unsigned Word
PMSUBW    Parallel Multiply-Subtract Word
PMULTH    Parallel Multiply Halfword
PMADDH    Parallel Multiply-Add Halfword
PMSUBH    Parallel Multiply-Subtract Halfword
PHMADH    Parallel Horizontal Multiply-Add Halfword
PHMSBH    Parallel Horizontal Multiply-Subtract Halfword
PDIVBW    Parallel Divide Broadcast Word
PMFHI     Parallel Move From HI Register
PMFLO     Parallel Move From LO Register
PMTHI     Parallel Move To HI Register
PMTLO     Parallel Move To LO Register
PMFHL     Parallel Move From HI/LO Register
PMTHL     Parallel Move To HI/LO Register


• Pack/Extend (11 instructions)

PPAC5     Parallel Pack to 5 bits
PPACB     Parallel Pack to Byte
PPACH     Parallel Pack to Halfword
PPACW     Parallel Pack to Word
PEXT5     Parallel Extend Upper from 5 bits
PEXTUB    Parallel Extend Upper from Byte
PEXTLB    Parallel Extend Lower from Byte
PEXTUH    Parallel Extend Upper from Halfword
PEXTLH    Parallel Extend Lower from Halfword
PEXTUW    Parallel Extend Upper from Word
PEXTLW    Parallel Extend Lower from Word


• Others (16 instructions)

PCPYH     Parallel Copy Halfword
PCPYLD    Parallel Copy Lower Doubleword
PCPYUD    Parallel Copy Upper Doubleword
PREVH     Parallel Reverse Halfword
PINTH     Parallel Interleave Halfword
PINTEH    Parallel Interleave Even Halfword
PEXEH     Parallel Exchange Even Halfword
PEXCH     Parallel Exchange Center Halfword
PEXEW     Parallel Exchange Even Word
PEXCW     Parallel Exchange Center Word
QFSRV     Quadword Funnel Shift Right Variable
MFSA      Move from Shift Amount Register
MTSA      Move to Shift Amount Register
MTSAB     Move Byte Count to Shift Amount Register
MTSAH     Move Halfword Count to Shift Amount Register
PROT3W    Parallel Rotate 3 Words
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B.4 Instruction Set Details 
In the following sections, details are provided for each of the C790-specific instructions. 


Exceptions that may occur due to the execution of each instruction are listed after the 
description of each instruction. Descriptions of the immediate cause and manner of 
handling exceptions are omitted from the instruction descriptions in this appendix. 




DIV1 Divide Word Pipeline 1 DIV1


31      26 25    21 20    16 15                 6 5         0
   MMI        rs       rt             0              DIV1
  011100                        00 0000 0000        011010
    6         5        5             10                6
C790
Format: DIV1 rs, rt
Purpose: To divide 32-bit signed integers using pipeline 1.
Description: (LO1, HI1) ← rs / rt


The 32-bit value in GPR rs is divided by the 32-bit value in GPR rt, treating both operands
as signed values. The 32-bit quotient is placed into special register LO1 (= LO_127..64) and
the 32-bit remainder is placed into special register HI1 (= HI_127..64).


No arithmetic exception occurs under any circumstances. 


Restrictions: 


If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation will be undefined.


If the divisor in GPR rt is zero, the arithmetic result value will be undefined. 


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
q ← GPR[rs]_31..0 div GPR[rt]_31..0
r ← GPR[rs]_31..0 mod GPR[rt]_31..0
LO_127..64 ← (q_31)^32 || q_31..0
HI_127..64 ← (r_31)^32 || r_31..0


Supplementary Explanation: 


Normally, dividing 0x80000000 (-2147483648), the minimum signed value, by
0xFFFFFFFF (-1) results in an overflow. With this instruction, however, no
overflow exception occurs and the result is as follows:


Quotient is 0x80000000 (-2147483648), and remainder is 0x00000000 (0).


The signs of the quotient and the remainder are determined by the signs of the dividend and
the divisor as shown in the table below:

Table B-1. Quotient and Remainder Signs


| Dividend | Divisor | Quotient | Remainder |


Exceptions: 


None 


Programming Notes: 


In C790, the integer divide operation proceeds asynchronously and allows other CPU 
instructions to execute before it is retired. An attempt to read LO1 or HI1 registers before
the results are written will cause an interlock until the results are ready. Out-of-order 
execution does not affect the program result, but offers an opportunity for performance 
improvement by scheduling the divide so that other instructions can execute in parallel. 


No arithmetic exception occurs under any circumstances. Divide-by-zero or overflow 
conditions should be detected by instructions preceding the divide instruction. If the 
divide is asynchronous then the zero-divisor check can execute in parallel with the divide. 
The action taken on either divide-by-zero or overflow is either a convention within the 
program itself or more typically, the system software; one possibility is to take a BREAK 
exception with a code field value to signal the problem to the system software. 


As an example, the C programming language in a UNIX environment expects division by 
zero to either terminate the program or execute a program-specified signal handler. C 
does not expect overflow to cause any exceptional condition. If the C compiler uses a divide 
instruction, it also emits code to test for a zero divisor and execute a BREAK instruction to 
inform the operating system if one is detected. 
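
The check-before-divide convention described above can be sketched in C. This is a minimal software model, not C790 code: the names div1_result and div1_checked are hypothetical, and the sketch assumes that DIV1 rounds the quotient toward zero in the same way as C99 integer division, which should be confirmed against Table B-1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the two values DIV1 produces. */
typedef struct {
    int32_t quotient;   /* value that would be placed in LO1 */
    int32_t remainder;  /* value that would be placed in HI1 */
} div1_result;

/*
 * Software model of a guarded DIV1.
 * Returns false for a zero divisor, because the hardware result is
 * undefined and the caller has to decide what to do (for example,
 * signal an error). The INT32_MIN / -1 case is handled explicitly:
 * the plain C expression would be undefined behaviour, and the values
 * returned here are the ones the manual documents for DIV1.
 */
bool div1_checked(int32_t rs, int32_t rt, div1_result *out)
{
    if (rt == 0) {
        return false;                /* divide-by-zero: no defined result */
    }
    if (rs == INT32_MIN && rt == -1) {
        out->quotient = INT32_MIN;   /* 0x80000000, as documented */
        out->remainder = 0;
        return true;
    }
    out->quotient = rs / rt;         /* C99: truncates toward zero */
    out->remainder = rs % rt;        /* C99: sign follows the dividend */
    return true;
}
```

A caller would test the return value first and read the quotient and remainder only when it is true, which mirrors the zero-divisor test a compiler emits next to the divide instruction.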
DIVU1 Divide Unsigned Word Pipeline 1 DIVU1


31      26 25    21 20    16 15                 6 5         0
   MMI        rs       rt             0             DIVU1
  011100                        00 0000 0000        011011
    6         5        5             10                6
C790
Format: DIVU1 rs, rt 
Purpose: To divide 32-bit unsigned integers using pipeline 1. 
Description: (LO1, HI1) ← rs / rt


The 32-bit value in GPR rs is divided by the 32-bit value in GPR rt, treating both operands
as unsigned values. The 32-bit quotient is placed into special register LO1 (= LO_127..64) and
the 32-bit remainder is placed into special register HI1 (= HI_127..64).


No arithmetic exception occurs under any circumstances. 


Restrictions: 


If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal
zero), then the result of the operation is undefined.


If the divisor in GPR rt is zero, the arithmetic result will be undefined. 


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
q ← (0 || GPR[rs]_31..0) div (0 || GPR[rt]_31..0)
r ← (0 || GPR[rs]_31..0) mod (0 || GPR[rt]_31..0)
LO_127..64 ← (q_31)^32 || q_31..0
HI_127..64 ← (r_31)^32 || r_31..0


Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the DIV1 instruction. 
LQ Load Quadword LQ


31      26 25    21 20    16 15                          0
    LQ       base      rt              offset
    6         5        5                 16
C790 
Format: LQ rt, offset (base) 
Purpose: To load a quadword from memory. 
Description: rt ← memory[base + offset]


The contents of the 128-bit quadword at the memory location specified by the effective 
address are fetched and placed in the 128-bit GPR rt. The 16-bit signed offset is added to 
the contents of GPR base register to form the effective address. The least-significant four 
bits of the effective address are masked to zero (effectively creating an aligned address) 
before being used to access memory. No address exceptions due to alignment are possible. 


Restriction: 
The effective address doesn’t have to be naturally aligned. The least significant 4 bits of 
the effective address are ignored. 
Operations: 
vAddr ← sign_extend(offset) + GPR[base]_31..0
vAddr_3..0 ← 0^4
(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)
memquad ← LoadMemory(uncached, QUADWORD, pAddr, vAddr, DATA)
GPR[rt]_127..0 ← memquad


Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 
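
The address calculation used by LQ can be illustrated with a small C model. The function name lq_effective_address is hypothetical, and the sketch assumes a 32-bit virtual address, as in the Operations pseudocode above.

```c
#include <stdint.h>

/*
 * Model of the LQ effective address: the 16-bit offset is sign-extended,
 * added to the base register, and the low four bits are forced to zero
 * so that the access is always quadword aligned.
 */
uint32_t lq_effective_address(uint32_t base, int16_t offset)
{
    /* Unsigned arithmetic wraps modulo 2^32, like the 32-bit adder. */
    uint32_t vaddr = base + (uint32_t)(int32_t)offset;
    return vaddr & ~(uint32_t)0xF;   /* clear bits 3..0 */
}
```

For example, a base of 0x1000 with an offset of 0x1F selects the quadword at 0x1010, the same quadword that an offset of 0x10 would select.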
MADD Multiply-Add word MADD


31      26 25    21 20    16 15    11 10     6 5         0
   MMI        rs       rt       rd       0        MADD
    6         5        5        5        5          6
C790 
Format: MADD rs, rt 
MADD rd, rs, rt 
Purpose: To multiply 32-bit signed integers and add. 
Description: (rd, HI, LO) ← (HI, LO) + rs × rt


The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as signed values, to produce a 64-bit multiply result. The 64-bit multiply result
is added to the contents of special registers HI and LO. The low-order 32-bit word of the
result is placed into special register LO and GPR rd, and the high-order 32-bit word of the
result is placed into special register HI.


No arithmetic exception occurs under any circumstances. 


If GPR rd is omitted in assembly language, 0 is used as the default value.


Restrictions: 


If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod ← (HI_31..0 || LO_31..0) + GPR[rs]_31..0 * GPR[rt]_31..0
LO_63..0 ← (prod_31)^32 || prod_31..0
HI_63..0 ← (prod_63)^32 || prod_63..32
GPR[rd]_63..0 ← (prod_31)^32 || prod_31..0
Exceptions: 
None 


Programming Notes: 


In C790, the integer multiply accumulate operation proceeds asynchronously and allows 
other CPU instructions to execute before it is retired. An attempt to read LO or HI
registers before the results are written will cause an interlock until the results are ready. 
Asynchronous execution does not affect the program result, but offers an opportunity for 
performance improvement by scheduling the multiply so that other instructions can 
execute in parallel. 
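
As an illustration of the accumulate step, the following C sketch models HI and LO as the two halves of a 64-bit accumulator. The hilo_pair type and the madd_model function are hypothetical names introduced for this example, and the sketch covers only the low 32 bits of each register, that is, the values before sign extension to 64 bits.

```c
#include <stdint.h>

/* Hypothetical model of the low 32 bits of the HI and LO registers. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} hilo_pair;

/*
 * Model of MADD: (HI, LO) <- (HI, LO) + rs * rt, signed operands.
 * Returns the new LO value, which MADD also writes to GPR rd.
 */
uint32_t madd_model(hilo_pair *acc, int32_t rs, int32_t rt)
{
    /* Concatenate HI || LO into one 64-bit accumulator. */
    uint64_t sum = ((uint64_t)acc->hi << 32) | acc->lo;

    /* The signed 32x32 product always fits in 64 bits; the addition
       wraps modulo 2^64 and raises no exception, as in the manual. */
    sum += (uint64_t)((int64_t)rs * (int64_t)rt);

    acc->lo = (uint32_t)sum;
    acc->hi = (uint32_t)(sum >> 32);
    return acc->lo;
}
```

Calling madd_model repeatedly on the same hilo_pair accumulates a dot product, which is the typical use of a multiply-add instruction.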
MADD1 Multiply-Add word Pipeline 1 MADD1


31      26 25    21 20    16 15    11 10     6 5         0
   MMI        rs       rt       rd       0        MADD1
  011100                               00000      100000
    6         5        5        5        5          6
C790 
Format: MADD1 rs, rt
MADD1 rd, rs, rt
Purpose: To multiply 32-bit signed integers and add in Pipeline 1.
Description: (rd, HI1, LO1) ← (HI1, LO1) + rs × rt


The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as signed values, to produce a 64-bit multiply result. The 64-bit multiply result
is added to the contents of special registers HI1 (= HI_127..64) and LO1 (= LO_127..64). The
low-order 32-bit word of the result is placed into special register LO1 and GPR rd, and the
high-order 32-bit word of the result is placed into special register HI1.


No arithmetic exception occurs under any circumstances. 


If GPR rd is omitted in assembly language, 0 is used as the default value.


Restrictions: 


If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod ← (HI_95..64 || LO_95..64) + GPR[rs]_31..0 * GPR[rt]_31..0
LO_127..64 ← (prod_31)^32 || prod_31..0
HI_127..64 ← (prod_63)^32 || prod_63..32
GPR[rd]_63..0 ← (prod_31)^32 || prod_31..0


Exceptions: 


None 


Programming Notes: 


In the C790, the integer multiply accumulate operation proceeds asynchronously and 
allows other CPU instructions to execute before it is retired. An attempt to read LO1 or 
HI1 registers before the results are written will cause an interlock until the results are 
ready. Asynchronous execution does not affect the program result, but offers an 
opportunity for performance improvement by scheduling the multiply so that other 
instructions can execute in parallel. 
Format: MADDU rs, rt
MADDU rd, rs, rt
Purpose: To multiply 32-bit unsigned integers and add. 
Description: (rd, HI, LO) ← (HI, LO) + rs × rt


The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as unsigned values, to produce a 64-bit multiply result. The 64-bit multiply
result is added to the contents of special registers HI and LO. The low-order 32-bit word of
the result is placed into special register LO and GPR rd, and the high-order 32-bit word of
the result is placed into special register HI.


No arithmetic exception occurs under any circumstances. 


If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions: 


If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal
zero), then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod ← (HI_31..0 || LO_31..0) + (0 || GPR[rs]_31..0) * (0 || GPR[rt]_31..0)
LO_63..0 ← (prod_31)^32 || prod_31..0
HI_63..0 ← (prod_63)^32 || prod_63..32
GPR[rd]_63..0 ← (prod_31)^32 || prod_31..0
Exceptions: 
None 


Programming Notes: 


See the Programming Notes for the MADD instruction.




MADDU1 Multiply-Add Unsigned word Pipeline 1 MADDU1


31      26 25    21 20    16 15    11 10     6 5         0
   MMI        rs       rt       rd       0       MADDU1
  011100                               00000      100001
    6         5        5        5        5          6
C790
Format: MADDU1 rs, rt
MADDU1 rd, rs, rt
Purpose: To multiply 32-bit unsigned integers and add in Pipeline 1.
Description: (rd, HI1, LO1) ← (HI1, LO1) + rs × rt


The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as unsigned values, to produce a 64-bit multiply result. The 64-bit multiply
result is added to the contents of special registers HI1 (= HI_127..64) and LO1 (= LO_127..64).
The low-order 32-bit word of the result is placed into special register LO1 and GPR rd,
and the high-order 32-bit word of the result is placed into special register HI1.


No arithmetic exception occurs under any circumstances. 


If GPR rd is omitted in assembly language, 0 is used as the default value.


Restrictions: 


If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal
zero), then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod ← (HI_95..64 || LO_95..64) + (0 || GPR[rs]_31..0) * (0 || GPR[rt]_31..0)
LO_127..64 ← (prod_31)^32 || prod_31..0
HI_127..64 ← (prod_63)^32 || prod_63..32
GPR[rd]_63..0 ← (prod_31)^32 || prod_31..0


Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the MADD1 instruction.
C790
Format: MFHI1 rd
Purpose: To copy the special purpose register HI1 to a GPR.
Description: rd ← HI1


The contents of special register HI1 (= HI_127..64) are loaded into GPR rd.
Restrictions: 
None 


Operation: 
GPR[rd]_63..0 ← HI_127..64
Exceptions: 


None 
Format: MFLO1 rd
Purpose: To copy the special purpose register LO1 to a GPR.
Description: rd ← LO1


The contents of special register LO1 (= LO_127..64) are loaded into GPR rd.
Restrictions: 
None 


Operation: 
GPR[rd]_63..0 ← LO_127..64
Exceptions: 


None 
Format: MFSA rd
Purpose: To copy the shift amount register SA to a GPR.
Description: rd ← SA


The contents of SA, the special register storing the funnel shift amount, are loaded into
GPR rd. Note that the shift amount is encoded in SA in an implementation-defined 
manner. Therefore, it is not meaningful for software to operate on the value returned in rd. 
The sole purpose of this instruction is to permit the shift amount to be saved during a 
context switch. The MTSA instruction should be used to restore the state of SA. 


Restrictions: 
None 


Operation: 
GPR[rd]_63..0 ← SA
Exceptions: 


None 


Implementation Note: 


This instruction executes only in pipeline 0. 
Format: MTHI1 rs
Purpose: To copy a GPR to the special purpose register HI1.
Description: HI1 ← rs


The contents of GPR rs are loaded into special register HI1 (= HI_127..64).
Restrictions: 
None 


Operation: 
HI_127..64 ← GPR[rs]_63..0
Exceptions: 


None 


Programming Notes: 


None 
Format: MTLO1 rs
Purpose: To copy a GPR to the special purpose register LO1.
Description: LO1 ← rs


The contents of GPR rs are loaded into special register LO1 (= LO_127..64).


Restrictions: 

None 
Operation: 

LO_127..64 ← GPR[rs]_63..0
Exceptions: 

None 


Programming Notes: 


None 
Format: MTSA rs
Purpose: To copy a GPR to the shift amount register SA.
Description: SA ← rs


The contents of GPR rs are loaded into SA, the special register storing the funnel shift 
amount. Note that rs must contain a value that was originally generated by MFSA. If 
some other user-generated value is in rs, the shifting action performed by the funnel 
shifter is not defined; that is, MTSA cannot be used by a program to set a new funnel
shift amount. This is because the shift amount is encoded in SA in an implementation- 
defined manner. The sole purpose of this instruction is to permit the shift amount to be 
restored during a context switch. 


Restrictions: 


Note that the three instructions statically preceding an MTSA instruction must not read or
write the SA register; that is, they cannot be any of the instructions MFSA, QFSRV, or
MTSAx.


Use the MTSAB and MTSAH instructions to set a new funnel shift amount. 


Operation: 
SA ← GPR[rs]_63..0
Exceptions: 


None 


Implementation Note: 


1. MTSA updates the SA register in the A Stage. To keep exception processing simple, 
this requires that the cycle prior to MTSA not read the SA register. Also, when 
single stepping, making sure that SA always contains the value of the SA write 
instruction, just single stepped, requires that the cycle after MTSA not write the 
SA register. Both these rules are enforced by the architectural requirement that 
the three instructions prior to MTSA not read SA. 


2. The MTSA instruction executes only in pipeline 0.
C790

Format: MTSAB rs, immediate
Purpose: To copy a GPR to the shift amount register SA.
Description: SA ← (rs xor immediate) × 8


The least-significant four bits of GPR rs are XORed with the least-significant four bits of 
the immediate value. The resulting four bits are interpreted as a byte shift amount and 
stored into SA, the special register storing the funnel shift amount. 


Restrictions: 
The three instructions statically preceding a MTSAB instruction must not read the SA 
register; that is, they cannot be either of the instructions MFSA or QFSRV. 


Operation: 

SA ← (GPR[rs]_3..0 xor immediate_3..0) * 8
Exceptions: 

None 
Implementation Note: 

1. MTSAB updates the SA register in the A Stage. To keep exception processing 
simple, this requires that the cycle prior to MTSAB not read the SA register. Also, 
when single stepping, making sure that SA always contains the value of the SA 
write instruction, just single stepped, requires that the cycle after the MTSAB not 
write the SA register. Both these rules are enforced by the architectural 
requirement that the three instructions prior to MTSAB not read SA. 

2. The MTSAB instruction executes only in pipeline 0. 

Programming Note: 
MTSAB allows the user to load either a variable shift amount or a fixed shift amount, as 
follows: 


mtsab 0, 5     // Set shift amount to “5 bytes”
mtsab 10, 0    // Set byte shift amount to contents of GPR10
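
The XOR-and-scale rule used by MTSAB can be modelled in C as shown below. The function name mtsab_shift_bits is hypothetical. Because the encoding of SA is implementation-defined, the sketch returns the logical shift amount in bits rather than the actual register contents.

```c
#include <stdint.h>

/*
 * Logical shift amount, in bits, selected by "mtsab rs, immediate".
 * Only the least-significant four bits of each operand take part.
 */
unsigned mtsab_shift_bits(uint64_t rs_value, unsigned immediate)
{
    unsigned bytes = ((unsigned)rs_value ^ immediate) & 0xFu;  /* 0..15 */
    return bytes * 8u;                                         /* bytes -> bits */
}
```

With rs equal to register 0 (value zero) the immediate alone selects the amount, and with an immediate of zero the register alone selects it, which matches the two assembly examples above.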
Format: MTSAH rs, immediate
Purpose: To copy a GPR to the shift amount register SA.
Description: SA ← (rs xor immediate) × 16


The least-significant three bits of GPR rs are XORed with the least-significant three bits 
of the immediate value. The resulting three bits are interpreted as a halfword shift 
amount and stored into SA, the special register storing the funnel shift amount. 


Restrictions: 
The three instructions statically preceding an MTSAH instruction must not read the SA
register; that is, they cannot be either of the instructions MFSA or QFSRV.


Operation: 
SA ← (GPR[rs]_2..0 xor immediate_2..0) * 16
Exceptions: 


None 


Implementation Note: 


1. MTSAH updates the SA register in the A Stage. To keep exception processing 
simple, this requires that the cycle prior to MTSAH not read the SA register. Also, 
when single stepping, making sure that SA always contains the value of the SA 
write instruction, just single stepped, requires that the cycle after MTSAH not 
write the SA register. Both these rules are enforced by the architectural 
requirement that the three instructions prior to MTSAH not read SA. 


2. The MTSAH instruction executes only in pipeline 0.

Programming Note: 
MTSAH allows the user to load either a variable shift amount or a fixed shift amount, as 
follows: 


mtsah 0, 5     // Set shift amount to “5 halfwords”
mtsah 10, 0    // Set halfword shift amount to value of GPR10
Format: MULT rd, rs, rt
MULT rs, rt

Purpose: To multiply 32-bit signed integers. 
Description: (rd, LO, HI) ← rs × rt


The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as signed values, to produce a 64-bit result. The low-order 32 bits of the result are
placed into special register LO and GPR rd, and the high-order 32 bits of the result are
placed into special register HI.


No arithmetic exception occurs under any circumstances. 


If GPR rd is omitted in assembly language, 0 is used as the default value.


Restrictions: 


If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod ← GPR[rs]_31..0 * GPR[rt]_31..0
LO_63..0 ← (prod_31)^32 || prod_31..0
HI_63..0 ← (prod_63)^32 || prod_63..32
GPR[rd]_63..0 ← (prod_31)^32 || prod_31..0
Exceptions: 
None 


Programming Notes: 


In the C790, the integer multiply operation allows other CPU instructions to execute
out-of-order. An attempt to read LO or HI registers before the results are written will cause
an interlock until the results are ready. Asynchronous execution does not affect the 
program result, but offers an opportunity for performance improvement by scheduling the 
multiply so that other instructions can execute in parallel. 


Programs that require overflow detection must check for it explicitly. 
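
One way to perform the explicit overflow check is to test whether HI is the sign extension of bit 31 of LO. The following C sketch models that test on the 64-bit product; mult_overflows is a hypothetical helper name, and the 32-bit hi and lo values stand for the low words of the HI and LO registers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the signed product rs * rt does not fit in 32 bits.
 * The low word of the product (the value MULT places in LO and rd) is
 * stored through lo_out in either case.
 */
bool mult_overflows(int32_t rs, int32_t rt, uint32_t *lo_out)
{
    /* A signed 32x32 product always fits in 64 bits. */
    uint64_t prod = (uint64_t)((int64_t)rs * (int64_t)rt);
    uint32_t lo = (uint32_t)prod;
    uint32_t hi = (uint32_t)(prod >> 32);

    /* If the result fits, HI is all copies of bit 31 of LO. */
    uint32_t sign_fill = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;

    *lo_out = lo;
    return hi != sign_fill;
}
```

On the hardware the same comparison would be made between the values read back with the move-from-HI and move-from-LO instructions.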
Format: MULT1 rd, rs, rt
MULT1 rs, rt

Purpose: To multiply 32-bit signed integers in Pipeline 1. 
Description: (rd, HI1, LO1) ← rs × rt


The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as signed values, to produce a 64-bit result. The low-order 32 bits of the result are
placed into special register LO1 (= LO_127..64) and GPR rd, and the high-order 32 bits of the
result are placed into special register HI1 (= HI_127..64).


No arithmetic exception occurs under any circumstances.


If GPR rd is omitted in assembly language, 0 is used as the default value.


Restrictions: 


If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod ← GPR[rs]_31..0 * GPR[rt]_31..0
LO_127..64 ← (prod_31)^32 || prod_31..0
HI_127..64 ← (prod_63)^32 || prod_63..32
GPR[rd]_63..0 ← (prod_31)^32 || prod_31..0
Exceptions: 
None 


Programming Notes: 


In the C790, the integer multiply operation allows other CPU instructions to execute
out-of-order. An attempt to read LO1 or HI1 before the results are written will cause an
interlock until the results are ready. Asynchronous execution does not affect the program 
result, but offers an opportunity for performance improvement by scheduling the multiply 
so that other instructions can execute in parallel. 


Programs that require overflow detection must check for it explicitly. 
Format: MULTU rd, rs, rt
MULTU rs, rt
Purpose: To multiply 32-bit unsigned integers. 
Description: (rd, HI, LO) ← rs × rt


The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as unsigned values, to produce a 64-bit result. The low-order 32 bits of the result
are placed into special register LO and GPR rd, and the high-order 32 bits of the result are
placed into special register HI.


No arithmetic exception occurs under any circumstances. 


If GPR rd is omitted in assembly language, 0 is used as the default value.


Restrictions: 


If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal
zero), then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod ← (0 || GPR[rs]_31..0) * (0 || GPR[rt]_31..0)
LO_63..0 ← (prod_31)^32 || prod_31..0
HI_63..0 ← (prod_63)^32 || prod_63..32
GPR[rd]_63..0 ← (prod_31)^32 || prod_31..0
Exceptions: 
None 


Programming Notes: 


See the Programming Notes for the MULT instruction. 
Format: MULTU1 rd, rs, rt
MULTU1 rs, rt
Purpose: To multiply 32-bit unsigned integers in Pipeline 1. 
Description: (rd, HI1, LO1) ← rs × rt


The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both
operands as unsigned values, to produce a 64-bit result. The low-order 32 bits of the result
are placed into special register LO1 (= LO_127..64) and GPR rd, and the high-order 32 bits of
the result are placed into special register HI1 (= HI_127..64).


No arithmetic exception occurs under any circumstances.


If GPR rd is omitted in assembly language, 0 is used as the default value.


Restrictions: 


If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal
zero), then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod ← (0 || GPR[rs]_31..0) * (0 || GPR[rt]_31..0)
LO_127..64 ← (prod_31)^32 || prod_31..0
HI_127..64 ← (prod_63)^32 || prod_63..32
GPR[rd]_63..0 ← (prod_31)^32 || prod_31..0
Exceptions: 
None 


Programming Notes: 


See the Programming Notes for the MULT1 instruction. 
Format: PABSH rd, rt
Purpose: To calculate the absolute value of 8 16-bit integers in parallel. 
Description: rd ← |rt|


The absolute values of the eight signed halfword values in GPR rt are placed into the
corresponding eight halfwords in GPR rd.


This instruction operates on 128-bit registers. 


Operation:
GPR[rd]_15..0 ← |GPR[rt]_15..0|
GPR[rd]_31..16 ← |GPR[rt]_31..16|
GPR[rd]_47..32 ← |GPR[rt]_47..32|
GPR[rd]_63..48 ← |GPR[rt]_63..48|
GPR[rd]_79..64 ← |GPR[rt]_79..64|
GPR[rd]_95..80 ← |GPR[rt]_95..80|
GPR[rd]_111..96 ← |GPR[rt]_111..96|
GPR[rd]_127..112 ← |GPR[rt]_127..112|


      127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rt       A7       A6       A5      A4      A3      A2      A1      A0
rd      |A7|     |A6|     |A5|    |A4|    |A3|    |A2|    |A1|    |A0|

Supplementary explanation: 


When a halfword value in GPR rt is 0x8000 (-32768), the smallest negative value, the
operation results in an overflow. However, no overflow exception occurs; the result
is saturated to the largest positive value, 0x7FFF (+32767).


Exceptions: 


None 
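
The lane-by-lane behaviour of PABSH, including the saturation of 0x8000, can be sketched in C. The function pabsh_model is a hypothetical name, and the 128-bit register is represented as an array of eight halfwords with element 0 standing for bits 15..0.

```c
#include <stdint.h>

/* Saturating absolute value of eight signed halfwords, one lane at a time. */
void pabsh_model(const int16_t rt[8], int16_t rd[8])
{
    for (int i = 0; i < 8; i++) {
        int32_t v = rt[i];          /* widen so that -(-32768) is representable */
        if (v < 0) {
            v = -v;
        }
        if (v > INT16_MAX) {
            v = INT16_MAX;          /* |0x8000| saturates to 0x7FFF */
        }
        rd[i] = (int16_t)v;
    }
}
```

Widening each lane to 32 bits before negating keeps the C code free of signed overflow, so the saturation case can be written as an ordinary comparison.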
Format: PABSW rd, rt
Purpose: To calculate the absolute value of 4 32-bit integers in parallel. 
Description: rd ← |rt|


The absolute values of the four signed word values in GPR rt are placed into the
corresponding four words in GPR rd.


This instruction operates on 128-bit registers. 


Operation:
GPR[rd]_31..0 ← |GPR[rt]_31..0|
GPR[rd]_63..32 ← |GPR[rt]_63..32|
GPR[rd]_95..64 ← |GPR[rt]_95..64|
GPR[rd]_127..96 ← |GPR[rt]_127..96|


      127..96  95..64  63..32  31..0
rt       A3      A2      A1      A0
rd      |A3|    |A2|    |A1|    |A0|

Supplementary explanation: 


When a word value in GPR rt is equal to 0x80000000 (-2147483648), the smallest
negative number, the operation results in an overflow. However, no overflow
exception occurs; the result is saturated to the largest positive value, 0x7FFFFFFF
(+2147483647).


Exceptions: 


None 
Format: PADDB rd, rs, rt
Purpose: To add 16 pairs of 8-bit integers in parallel. 
Description: rd ← rs + rt


The sixteen byte values in GPR rs are added to the corresponding sixteen byte values in
GPR rt in parallel. The results are placed into the corresponding sixteen bytes in GPR rd.


No overflow or underflow exceptions are generated under any circumstances.


This instruction operates on 128-bit registers.


Operation:
GPR[rd]_7..0 ← (GPR[rs]_7..0 + GPR[rt]_7..0)_7..0
GPR[rd]_15..8 ← (GPR[rs]_15..8 + GPR[rt]_15..8)_7..0
GPR[rd]_23..16 ← (GPR[rs]_23..16 + GPR[rt]_23..16)_7..0
GPR[rd]_31..24 ← (GPR[rs]_31..24 + GPR[rt]_31..24)_7..0
GPR[rd]_39..32 ← (GPR[rs]_39..32 + GPR[rt]_39..32)_7..0
GPR[rd]_47..40 ← (GPR[rs]_47..40 + GPR[rt]_47..40)_7..0
GPR[rd]_55..48 ← (GPR[rs]_55..48 + GPR[rt]_55..48)_7..0
GPR[rd]_63..56 ← (GPR[rs]_63..56 + GPR[rt]_63..56)_7..0
GPR[rd]_71..64 ← (GPR[rs]_71..64 + GPR[rt]_71..64)_7..0
GPR[rd]_79..72 ← (GPR[rs]_79..72 + GPR[rt]_79..72)_7..0
GPR[rd]_87..80 ← (GPR[rs]_87..80 + GPR[rt]_87..80)_7..0
GPR[rd]_95..88 ← (GPR[rs]_95..88 + GPR[rt]_95..88)_7..0
GPR[rd]_103..96 ← (GPR[rs]_103..96 + GPR[rt]_103..96)_7..0
GPR[rd]_111..104 ← (GPR[rs]_111..104 + GPR[rt]_111..104)_7..0
GPR[rd]_119..112 ← (GPR[rs]_119..112 + GPR[rt]_119..112)_7..0
GPR[rd]_127..120 ← (GPR[rs]_127..120 + GPR[rt]_127..120)_7..0


      127..120  119..112  ...   15..8    7..0
rs       A15       A14    ...     A1      A0
rt       B15       B14    ...     B1      B0
rd     A15+B15   A14+B14  ...   A1+B1   A0+B0



Exceptions: 


None 
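
The wraparound behaviour of PADDB can be sketched in C as follows. The function paddb_model is a hypothetical name, and the 128-bit registers are represented as arrays of sixteen bytes with element 0 standing for bits 7..0.

```c
#include <stdint.h>

/* Byte-wise modular addition: each lane keeps only bits 7..0 of its sum. */
void paddb_model(const uint8_t rs[16], const uint8_t rt[16], uint8_t rd[16])
{
    for (int i = 0; i < 16; i++) {
        /* The sum is computed as int; the cast discards the carry out of
           bit 7, so no lane can affect its neighbour. */
        rd[i] = (uint8_t)(rs[i] + rt[i]);
    }
}
```

Because only the low eight bits are kept, the same bit pattern results whether the lanes are interpreted as signed or unsigned bytes.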
C790
Format: PADDH rd, rs, rt
Purpose: To add 8 pairs of 16-bit integers in parallel.
Description: rd ← rs + rt


The eight halfword values in GPR rs are added to the corresponding eight halfword values 
in GPR rt in parallel. The results are placed into the corresponding eight halfwords in 
GPR rd. 


No overflow or underflow exceptions are generated under any circumstances. 


This instruction operates on 128-bit registers. 


Operation:
GPR[rd]_15..0 ← (GPR[rs]_15..0 + GPR[rt]_15..0)_15..0
GPR[rd]_31..16 ← (GPR[rs]_31..16 + GPR[rt]_31..16)_15..0
GPR[rd]_47..32 ← (GPR[rs]_47..32 + GPR[rt]_47..32)_15..0
GPR[rd]_63..48 ← (GPR[rs]_63..48 + GPR[rt]_63..48)_15..0
GPR[rd]_79..64 ← (GPR[rs]_79..64 + GPR[rt]_79..64)_15..0
GPR[rd]_95..80 ← (GPR[rs]_95..80 + GPR[rt]_95..80)_15..0
GPR[rd]_111..96 ← (GPR[rs]_111..96 + GPR[rt]_111..96)_15..0
GPR[rd]_127..112 ← (GPR[rs]_127..112 + GPR[rt]_127..112)_15..0


      127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs       A7       A6       A5      A4      A3      A2      A1      A0
rt       B7       B6       B5      B4      B3      B2      B1      B0
rd     A7+B7    A6+B6    A5+B5   A4+B4   A3+B3   A2+B2   A1+B1   A0+B0


Exceptions: 


None 
Format: PADDSB rd, rs, rt
Purpose: To add 16 pairs of 8-bit signed integers with saturation in parallel. 
Description: rd ← rs + rt


The sixteen signed byte values in GPR rs are added to the corresponding sixteen signed 
byte values in GPR rt in parallel. The results are placed into the corresponding sixteen 
bytes in GPR rd. 


No overflow or underflow exceptions are generated under any circumstances. Results 
beyond the range of a signed byte value are saturated according to the following: 


Overflow: 0x7F
Underflow: 0x80


This instruction operates on 128-bit registers. 
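
The saturation rule described above can be sketched in C. The function paddsb_model is a hypothetical name; it follows the prose description (signed bytes clamped to the range 0x80..0x7F) and represents the 128-bit registers as arrays of sixteen signed bytes with element 0 standing for bits 7..0.

```c
#include <stdint.h>

/* Signed byte-wise addition with saturation, one lane at a time. */
void paddsb_model(const int8_t rs[16], const int8_t rt[16], int8_t rd[16])
{
    for (int i = 0; i < 16; i++) {
        int sum = rs[i] + rt[i];   /* exact sum, in the range -256..254 */
        if (sum > INT8_MAX) {
            sum = INT8_MAX;        /* overflow saturates to 0x7F */
        }
        if (sum < INT8_MIN) {
            sum = INT8_MIN;        /* underflow saturates to 0x80 */
        }
        rd[i] = (int8_t)sum;
    }
}
```

Computing the exact sum in a wider type and clamping afterwards is a common way to express saturation in portable C; the hardware produces the result for all sixteen lanes at once.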


Operation:
if ((GPR[rs]_7..0 + GPR[rt]_7..0) > 0x7F) then
    GPR[rd]_7..0 ← 0x7F
else if (0x100 <= (GPR[rs]_7..0 + GPR[rt]_7..0) < 0x180) then
    GPR[rd]_7..0 ← 0x80
else
    GPR[rd]_7..0 ← (GPR[rs]_7..0 + GPR[rt]_7..0)_7..0
endif
if ((GPR[rs]_15..8 + GPR[rt]_15..8) > 0x7F) then
    GPR[rd]_15..8 ← 0x7F
else if (0x100 <= (GPR[rs]_15..8 + GPR[rt]_15..8) < 0x180) then
    GPR[rd]_15..8 ← 0x80
else
    GPR[rd]_15..8 ← (GPR[rs]_15..8 + GPR[rt]_15..8)_7..0
endif
if ((GPR[rs]_23..16 + GPR[rt]_23..16) > 0x7F) then
    GPR[rd]_23..16 ← 0x7F
else if (0x100 <= (GPR[rs]_23..16 + GPR[rt]_23..16) < 0x180) then
    GPR[rd]_23..16 ← 0x80
else
    GPR[rd]_23..16 ← (GPR[rs]_23..16 + GPR[rt]_23..16)_7..0
endif
if ((GPR[rs]_31..24 + GPR[rt]_31..24) > 0x7F) then
    GPR[rd]_31..24 ← 0x7F
else if (0x100 <= (GPR[rs]_31..24 + GPR[rt]_31..24) < 0x180) then
    GPR[rd]_31..24 ← 0x80
else
    GPR[rd]_31..24 ← (GPR[rs]_31..24 + GPR[rt]_31..24)_7..0
endif
if ((GPR[rs]_39..32 + GPR[rt]_39..32) > 0x7F) then
    GPR[rd]_39..32 ← 0x7F
else if (0x100 <= (GPR[rs]_39..32 + GPR[rt]_39..32) < 0x180) then
    GPR[rd]_39..32 ← 0x80
else
    GPR[rd]_39..32 ← (GPR[rs]_39..32 + GPR[rt]_39..32)_7..0
endif
if ((GPR[rs]_47..40 + GPR[rt]_47..40) > 0x7F) then
    GPR[rd]_47..40 ← 0x7F
else if (0x100 <= (GPR[rs]_47..40 + GPR[rt]_47..40) < 0x180) then
    GPR[rd]_47..40 ← 0x80
else
    GPR[rd]_47..40 ← (GPR[rs]_47..40 + GPR[rt]_47..40)_7..0
endif
if ((GPR[rs]_55..48 + GPR[rt]_55..48) > 0x7F) then
    GPR[rd]_55..48 ← 0x7F
else if (0x100 <= (GPR[rs]_55..48 + GPR[rt]_55..48) < 0x180) then
    GPR[rd]_55..48 ← 0x80
else
    GPR[rd]_55..48 ← (GPR[rs]_55..48 + GPR[rt]_55..48)_7..0
endif
if ((GPR[rs]_63..56 + GPR[rt]_63..56) > 0x7F) then
    GPR[rd]_63..56 ← 0x7F
else if (0x100 <= (GPR[rs]_63..56 + GPR[rt]_63..56) < 0x180) then
    GPR[rd]_63..56 ← 0x80
else
    GPR[rd]_63..56 ← (GPR[rs]_63..56 + GPR[rt]_63..56)_7..0
endif
if ((GPR[rs]_71..64 + GPR[rt]_71..64) > 0x7F) then
    GPR[rd]_71..64 ← 0x7F
else if (0x100 <= (GPR[rs]_71..64 + GPR[rt]_71..64) < 0x180) then
    GPR[rd]_71..64 ← 0x80
else
    GPR[rd]_71..64 ← (GPR[rs]_71..64 + GPR[rt]_71..64)_7..0
endif
if ((GPR[rs]_79..72 + GPR[rt]_79..72) > 0x7F) then
    GPR[rd]_79..72 ← 0x7F
else if (0x100 <= (GPR[rs]_79..72 + GPR[rt]_79..72) < 0x180) then
    GPR[rd]_79..72 ← 0x80
else
    GPR[rd]_79..72 ← (GPR[rs]_79..72 + GPR[rt]_79..72)_7..0
endif
if ((GPR[rs]_87..80 + GPR[rt]_87..80) > 0x7F) then
    GPR[rd]_87..80 ← 0x7F
else if (0x100 <= (GPR[rs]_87..80 + GPR[rt]_87..80) < 0x180) then
    GPR[rd]_87..80 ← 0x80
else
    GPR[rd]_87..80 ← (GPR[rs]_87..80 + GPR[rt]_87..80)_7..0
endif
if ((GPR[rs]_95..88 + GPR[rt]_95..88) > 0x7F) then
    GPR[rd]_95..88 ← 0x7F
else if (0x100 <= (GPR[rs]_95..88 + GPR[rt]_95..88) < 0x180) then
    GPR[rd]_95..88 ← 0x80
else
    GPR[rd]_95..88 ← (GPR[rs]_95..88 + GPR[rt]_95..88)_7..0
endif
if ((GPR[rs]_103..96 + GPR[rt]_103..96) > 0x7F) then
    GPR[rd]_103..96 ← 0x7F
else if (0x100 <= (GPR[rs]_103..96 + GPR[rt]_103..96) < 0x180) then
    GPR[rd]_103..96 ← 0x80
else
    GPR[rd]_103..96 ← (GPR[rs]_103..96 + GPR[rt]_103..96)_7..0
endif
if ((GPR[rs]_111..104 + GPR[rt]_111..104) > 0x7F) then
    GPR[rd]_111..104 ← 0x7F
else if (0x100 <= (GPR[rs]_111..104 + GPR[rt]_111..104) < 0x180) then
    GPR[rd]_111..104 ← 0x80
else
    GPR[rd]_111..104 ← (GPR[rs]_111..104 + GPR[rt]_111..104)_7..0
endif
if ((GPR[rs]_119..112 + GPR[rt]_119..112) > 0x7F) then
    GPR[rd]_119..112 ← 0x7F
else if (0x100 <= (GPR[rs]_119..112 + GPR[rt]_119..112) < 0x180) then
GPR[rd]119..112 < 0x80 
else 
GPR [rd]u19..112 < (GPR[rs]119.112 +GPR[rt]119..112)7..0 
endif 
if ((GPR[rs]127..120 +GPR[rt]127..120) >Ox7F ) then 
GPR[rd]127..120 < Ox7F 
else if (Ox100 <=(GPR[rs]127..120 +GPR[rt]127.120) <0x180) then 
GPR[rd]127..120 < 0x80 
else 
GPR[rd]127..120 < (GPR[rs]127.120 +GPR[rt]127..120)7..0 
endif 
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* Saturate to signed byte 


Exceptions: 


None 
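
The signed byte saturation described for PADDSB can be sketched in C as follows. This is an illustrative model only, not the C790 implementation: the type `q128_i8` and the functions `sat_i8` and `paddsb_model` are hypothetical names, and lane 0 is assumed to hold bits 7..0.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as 16 signed byte lanes;
   b[0] is assumed to hold bits 7..0, b[15] bits 127..120. */
typedef struct { int8_t b[16]; } q128_i8;

/* Clamp a widened sum to the signed byte range. */
static int8_t sat_i8(int sum) {
    if (sum > INT8_MAX) return INT8_MAX;   /* overflow:  0x7F */
    if (sum < INT8_MIN) return INT8_MIN;   /* underflow: 0x80 */
    return (int8_t)sum;
}

/* Lane-wise signed saturating add, modelling PADDSB rd, rs, rt. */
static q128_i8 paddsb_model(q128_i8 rs, q128_i8 rt) {
    q128_i8 rd;
    for (int i = 0; i < 16; i++) {
        /* Widen to int so the intermediate sum cannot wrap. */
        rd.b[i] = sat_i8((int)rs.b[i] + (int)rt.b[i]);
    }
    return rd;
}
```

Each lane is computed independently, so a saturated result in one byte never affects its neighbours.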
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Format: PADDSH rd, rs, rt 
Purpose: To add 8 pairs of 16-bit signed integers with saturation in parallel. 
Description: rd ← rs + rt


The eight signed halfword values in GPR rs are added to the corresponding eight signed 
halfword values in GPR rt in parallel. The results are placed into the corresponding eight 
halfwords in GPR rd.


No overflow or underflow exceptions are generated under any circumstances. Results 
beyond the range of a signed halfword value are saturated according to the following: 


Overflow: 0x7FFF
Underflow: 0x8000 


This instruction operates on 128-bit registers. 


Operation: 

if ((GPR[rs]15..0 + GPR[rt]15..0) > 0x7FFF) then
  GPR[rd]15..0 ← 0x7FFF
else if (0x10000 <= (GPR[rs]15..0 + GPR[rt]15..0) < 0x18000) then
  GPR[rd]15..0 ← 0x8000
else
  GPR[rd]15..0 ← (GPR[rs]15..0 + GPR[rt]15..0)15..0
endif
if ((GPR[rs]31..16 + GPR[rt]31..16) > 0x7FFF) then
  GPR[rd]31..16 ← 0x7FFF
else if (0x10000 <= (GPR[rs]31..16 + GPR[rt]31..16) < 0x18000) then
  GPR[rd]31..16 ← 0x8000
else
  GPR[rd]31..16 ← (GPR[rs]31..16 + GPR[rt]31..16)15..0
endif
if ((GPR[rs]47..32 + GPR[rt]47..32) > 0x7FFF) then
  GPR[rd]47..32 ← 0x7FFF
else if (0x10000 <= (GPR[rs]47..32 + GPR[rt]47..32) < 0x18000) then
  GPR[rd]47..32 ← 0x8000
else
  GPR[rd]47..32 ← (GPR[rs]47..32 + GPR[rt]47..32)15..0
endif
if ((GPR[rs]63..48 + GPR[rt]63..48) > 0x7FFF) then
  GPR[rd]63..48 ← 0x7FFF
else if (0x10000 <= (GPR[rs]63..48 + GPR[rt]63..48) < 0x18000) then
  GPR[rd]63..48 ← 0x8000
else
  GPR[rd]63..48 ← (GPR[rs]63..48 + GPR[rt]63..48)15..0
endif
if ((GPR[rs]79..64 + GPR[rt]79..64) > 0x7FFF) then
  GPR[rd]79..64 ← 0x7FFF
else if (0x10000 <= (GPR[rs]79..64 + GPR[rt]79..64) < 0x18000) then
  GPR[rd]79..64 ← 0x8000
else
  GPR[rd]79..64 ← (GPR[rs]79..64 + GPR[rt]79..64)15..0
endif
if ((GPR[rs]95..80 + GPR[rt]95..80) > 0x7FFF) then
  GPR[rd]95..80 ← 0x7FFF
else if (0x10000 <= (GPR[rs]95..80 + GPR[rt]95..80) < 0x18000) then
  GPR[rd]95..80 ← 0x8000
else
  GPR[rd]95..80 ← (GPR[rs]95..80 + GPR[rt]95..80)15..0
endif
if ((GPR[rs]111..96 + GPR[rt]111..96) > 0x7FFF) then
  GPR[rd]111..96 ← 0x7FFF
else if (0x10000 <= (GPR[rs]111..96 + GPR[rt]111..96) < 0x18000) then
  GPR[rd]111..96 ← 0x8000
else
  GPR[rd]111..96 ← (GPR[rs]111..96 + GPR[rt]111..96)15..0
endif
if ((GPR[rs]127..112 + GPR[rt]127..112) > 0x7FFF) then
  GPR[rd]127..112 ← 0x7FFF
else if (0x10000 <= (GPR[rs]127..112 + GPR[rt]127..112) < 0x18000) then
  GPR[rd]127..112 ← 0x8000
else
  GPR[rd]127..112 ← (GPR[rs]127..112 + GPR[rt]127..112)15..0
endif
      127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs    A7        A6       A5      A4      A3      A2      A1      A0
      +         +        +       +       +       +       +       +
rt    B7        B6       B5      B4      B3      B2      B1      B0
rd*   A7+B7     A6+B6    A5+B5   A4+B4   A3+B3   A2+B2   A1+B1   A0+B0


* Saturate to signed halfword 


Exceptions: 


None 
Format: PADDSW rd, rs, rt
Purpose: To add 4 pairs of 32-bit signed integers with saturation in parallel.
Description: rd ← rs + rt


The four signed word values in GPR rs are added to the corresponding four signed word
values in GPR rt in parallel. The results are placed into the corresponding four words in
GPR rd.


No overflow or underflow exceptions are generated under any circumstances. Results 
beyond the range of a signed word value are saturated according to the following: 


Overflow: 0x7FFFFFFF
Underflow: 0x80000000 


This instruction operates on 128-bit registers. 


Operation: 

if ((GPR[rs]31..0 + GPR[rt]31..0) > 0x7FFFFFFF) then
  GPR[rd]31..0 ← 0x7FFFFFFF
else if (0x100000000 <= (GPR[rs]31..0 + GPR[rt]31..0) < 0x180000000) then
  GPR[rd]31..0 ← 0x80000000
else
  GPR[rd]31..0 ← (GPR[rs]31..0 + GPR[rt]31..0)31..0
endif
if ((GPR[rs]63..32 + GPR[rt]63..32) > 0x7FFFFFFF) then
  GPR[rd]63..32 ← 0x7FFFFFFF
else if (0x100000000 <= (GPR[rs]63..32 + GPR[rt]63..32) < 0x180000000) then
  GPR[rd]63..32 ← 0x80000000
else
  GPR[rd]63..32 ← (GPR[rs]63..32 + GPR[rt]63..32)31..0
endif
if ((GPR[rs]95..64 + GPR[rt]95..64) > 0x7FFFFFFF) then
  GPR[rd]95..64 ← 0x7FFFFFFF
else if (0x100000000 <= (GPR[rs]95..64 + GPR[rt]95..64) < 0x180000000) then
  GPR[rd]95..64 ← 0x80000000
else
  GPR[rd]95..64 ← (GPR[rs]95..64 + GPR[rt]95..64)31..0
endif
if ((GPR[rs]127..96 + GPR[rt]127..96) > 0x7FFFFFFF) then
  GPR[rd]127..96 ← 0x7FFFFFFF
else if (0x100000000 <= (GPR[rs]127..96 + GPR[rt]127..96) < 0x180000000) then
  GPR[rd]127..96 ← 0x80000000
else
  GPR[rd]127..96 ← (GPR[rs]127..96 + GPR[rt]127..96)31..0
endif

      127..96   95..64   63..32   31..0
rs    A3        A2       A1       A0
      +         +        +        +
rt    B3        B2       B1       B0
rd*   A3+B3     A2+B2    A1+B1    A0+B0


* Saturate to signed word 


Exceptions: 


None 
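
For the word-sized case, a C model needs an intermediate wider than 32 bits so that the sum itself cannot overflow before it is clamped. The sketch below assumes a hypothetical `q128_i32` type with w[0] holding bits 31..0; `sat_i32` and `paddsw_model` are illustrative names, not part of any Toshiba toolchain.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as four signed word lanes;
   w[0] is assumed to hold bits 31..0. */
typedef struct { int32_t w[4]; } q128_i32;

/* Clamp a 64-bit sum to the signed 32-bit range. */
static int32_t sat_i32(int64_t sum) {
    if (sum > INT32_MAX) return INT32_MAX;   /* overflow:  0x7FFFFFFF */
    if (sum < INT32_MIN) return INT32_MIN;   /* underflow: 0x80000000 */
    return (int32_t)sum;
}

/* Lane-wise signed saturating add, modelling PADDSW rd, rs, rt. */
static q128_i32 paddsw_model(q128_i32 rs, q128_i32 rt) {
    q128_i32 rd;
    for (int i = 0; i < 4; i++) {
        rd.w[i] = sat_i32((int64_t)rs.w[i] + (int64_t)rt.w[i]);
    }
    return rd;
}
```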
PADDUB Parallel Add with Unsigned saturation Byte PADDUB
Format: PADDUB rd, rs, rt
Purpose: To add 16 pairs of 8-bit unsigned integers with saturation in parallel.
Description: rd ← rs + rt


The sixteen unsigned byte values in GPR rs are added to the corresponding sixteen 
unsigned byte values in GPR rt in parallel. The results are placed into the corresponding 
sixteen bytes in GPR rd.


No overflow exceptions are generated under any circumstances. Results beyond the range 
of an unsigned byte value are saturated according to the following: 


Overflow: 0xFF


This instruction operates on 128-bit registers. 


Operation: 


if ((GPR[rs]7..0 + GPR[rt]7..0) > 0xFF) then
  GPR[rd]7..0 ← 0xFF
else
  GPR[rd]7..0 ← (GPR[rs]7..0 + GPR[rt]7..0)7..0
endif
if ((GPR[rs]15..8 + GPR[rt]15..8) > 0xFF) then
  GPR[rd]15..8 ← 0xFF
else
  GPR[rd]15..8 ← (GPR[rs]15..8 + GPR[rt]15..8)7..0
endif
if ((GPR[rs]23..16 + GPR[rt]23..16) > 0xFF) then
  GPR[rd]23..16 ← 0xFF
else
  GPR[rd]23..16 ← (GPR[rs]23..16 + GPR[rt]23..16)7..0
endif
if ((GPR[rs]31..24 + GPR[rt]31..24) > 0xFF) then
  GPR[rd]31..24 ← 0xFF
else
  GPR[rd]31..24 ← (GPR[rs]31..24 + GPR[rt]31..24)7..0
endif
if ((GPR[rs]39..32 + GPR[rt]39..32) > 0xFF) then
  GPR[rd]39..32 ← 0xFF
else
  GPR[rd]39..32 ← (GPR[rs]39..32 + GPR[rt]39..32)7..0
endif
if ((GPR[rs]47..40 + GPR[rt]47..40) > 0xFF) then
  GPR[rd]47..40 ← 0xFF
else
  GPR[rd]47..40 ← (GPR[rs]47..40 + GPR[rt]47..40)7..0
endif
if ((GPR[rs]55..48 + GPR[rt]55..48) > 0xFF) then
  GPR[rd]55..48 ← 0xFF
else
  GPR[rd]55..48 ← (GPR[rs]55..48 + GPR[rt]55..48)7..0
endif
if ((GPR[rs]63..56 + GPR[rt]63..56) > 0xFF) then
  GPR[rd]63..56 ← 0xFF
else
  GPR[rd]63..56 ← (GPR[rs]63..56 + GPR[rt]63..56)7..0
endif
if ((GPR[rs]71..64 + GPR[rt]71..64) > 0xFF) then
  GPR[rd]71..64 ← 0xFF
else
  GPR[rd]71..64 ← (GPR[rs]71..64 + GPR[rt]71..64)7..0
endif
if ((GPR[rs]79..72 + GPR[rt]79..72) > 0xFF) then
  GPR[rd]79..72 ← 0xFF
else
  GPR[rd]79..72 ← (GPR[rs]79..72 + GPR[rt]79..72)7..0
endif
if ((GPR[rs]87..80 + GPR[rt]87..80) > 0xFF) then
  GPR[rd]87..80 ← 0xFF
else
  GPR[rd]87..80 ← (GPR[rs]87..80 + GPR[rt]87..80)7..0
endif
if ((GPR[rs]95..88 + GPR[rt]95..88) > 0xFF) then
  GPR[rd]95..88 ← 0xFF
else
  GPR[rd]95..88 ← (GPR[rs]95..88 + GPR[rt]95..88)7..0
endif
if ((GPR[rs]103..96 + GPR[rt]103..96) > 0xFF) then
  GPR[rd]103..96 ← 0xFF
else
  GPR[rd]103..96 ← (GPR[rs]103..96 + GPR[rt]103..96)7..0
endif
if ((GPR[rs]111..104 + GPR[rt]111..104) > 0xFF) then
  GPR[rd]111..104 ← 0xFF
else
  GPR[rd]111..104 ← (GPR[rs]111..104 + GPR[rt]111..104)7..0
endif
if ((GPR[rs]119..112 + GPR[rt]119..112) > 0xFF) then
  GPR[rd]119..112 ← 0xFF
else
  GPR[rd]119..112 ← (GPR[rs]119..112 + GPR[rt]119..112)7..0
endif
if ((GPR[rs]127..120 + GPR[rt]127..120) > 0xFF) then
  GPR[rd]127..120 ← 0xFF
else
  GPR[rd]127..120 ← (GPR[rs]127..120 + GPR[rt]127..120)7..0
endif


[Figure: the sixteen bytes A15..A0 of rs (bits 127..120 down to 7..0) are added to the corresponding bytes B15..B0 of rt; rd* receives A15+B15 ... A0+B0.]

* Saturate to unsigned byte

Exceptions: 


None 
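
Unsigned saturation has only an upper bound, which the following C sketch illustrates. The names `q128_u8` and `paddub_model` are hypothetical, and b[0] is assumed to hold bits 7..0.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as 16 unsigned byte lanes;
   b[0] is assumed to hold bits 7..0. */
typedef struct { uint8_t b[16]; } q128_u8;

/* Lane-wise unsigned saturating add, modelling PADDUB rd, rs, rt. */
static q128_u8 paddub_model(q128_u8 rs, q128_u8 rt) {
    q128_u8 rd;
    for (int i = 0; i < 16; i++) {
        /* Widen first so that the comparison sees the true sum. */
        unsigned sum = (unsigned)rs.b[i] + (unsigned)rt.b[i];
        rd.b[i] = (sum > 0xFF) ? 0xFF : (uint8_t)sum;
    }
    return rd;
}
```

Saturating byte addition of this kind is commonly used for pixel arithmetic, where a wrapped result would show up as a visible artifact.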




PADDUH Parallel Add with Unsigned saturation Halfword PADDUH
Format: PADDUH rd, rs, rt
Purpose: To add 8 pairs of 16-bit unsigned integers with saturation in parallel.
Description: rd ← rs + rt


The eight unsigned halfword values in GPR rs are added to the corresponding eight 
unsigned halfword values in GPR rt in parallel. The results are placed into the 
corresponding eight halfwords in GPR rd. 


No overflow exceptions are generated under any circumstances. Results beyond the range 
of an unsigned halfword value are saturated according to the following: 


Overflow: 0xFFFF


This instruction operates on 128-bit registers. 


Operation: 
if ((GPR[rs]15..0 + GPR[rt]15..0) > 0xFFFF) then
  GPR[rd]15..0 ← 0xFFFF
else
  GPR[rd]15..0 ← (GPR[rs]15..0 + GPR[rt]15..0)15..0
endif
if ((GPR[rs]31..16 + GPR[rt]31..16) > 0xFFFF) then
  GPR[rd]31..16 ← 0xFFFF
else
  GPR[rd]31..16 ← (GPR[rs]31..16 + GPR[rt]31..16)15..0
endif
if ((GPR[rs]47..32 + GPR[rt]47..32) > 0xFFFF) then
  GPR[rd]47..32 ← 0xFFFF
else
  GPR[rd]47..32 ← (GPR[rs]47..32 + GPR[rt]47..32)15..0
endif
if ((GPR[rs]63..48 + GPR[rt]63..48) > 0xFFFF) then
  GPR[rd]63..48 ← 0xFFFF
else
  GPR[rd]63..48 ← (GPR[rs]63..48 + GPR[rt]63..48)15..0
endif
if ((GPR[rs]79..64 + GPR[rt]79..64) > 0xFFFF) then
  GPR[rd]79..64 ← 0xFFFF
else
  GPR[rd]79..64 ← (GPR[rs]79..64 + GPR[rt]79..64)15..0
endif
if ((GPR[rs]95..80 + GPR[rt]95..80) > 0xFFFF) then
  GPR[rd]95..80 ← 0xFFFF
else
  GPR[rd]95..80 ← (GPR[rs]95..80 + GPR[rt]95..80)15..0
endif
if ((GPR[rs]111..96 + GPR[rt]111..96) > 0xFFFF) then
  GPR[rd]111..96 ← 0xFFFF
else
  GPR[rd]111..96 ← (GPR[rs]111..96 + GPR[rt]111..96)15..0
endif
if ((GPR[rs]127..112 + GPR[rt]127..112) > 0xFFFF) then
  GPR[rd]127..112 ← 0xFFFF
else
  GPR[rd]127..112 ← (GPR[rs]127..112 + GPR[rt]127..112)15..0
endif
      127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs    A7        A6       A5      A4      A3      A2      A1      A0
      +         +        +       +       +       +       +       +
rt    B7        B6       B5      B4      B3      B2      B1      B0
rd*   A7+B7     A6+B6    A5+B5   A4+B4   A3+B3   A2+B2   A1+B1   A0+B0


* Saturate to unsigned halfword 


Exceptions: 


None 
PADDUW Parallel Add with Unsigned saturation Word PADDUW
Format: PADDUW rd, rs, rt
Purpose: To add 4 pairs of 32-bit unsigned integers with saturation in parallel.
Description: rd ← rs + rt


The four unsigned word values in GPR rs are added to the corresponding four unsigned 
word values in GPR rt in parallel. The results are placed into the corresponding four 
words in GPR rd. 


No overflow exceptions are generated under any circumstances. Results beyond the range 
of an unsigned word value are saturated according to the following: 


Overflow: 0xFFFFFFFF


This instruction operates on 128-bit registers. 


Operation: 
if ((GPR[rs]31..0 + GPR[rt]31..0) > 0xFFFFFFFF) then
  GPR[rd]31..0 ← 0xFFFFFFFF
else
  GPR[rd]31..0 ← (GPR[rs]31..0 + GPR[rt]31..0)31..0
endif
if ((GPR[rs]63..32 + GPR[rt]63..32) > 0xFFFFFFFF) then
  GPR[rd]63..32 ← 0xFFFFFFFF
else
  GPR[rd]63..32 ← (GPR[rs]63..32 + GPR[rt]63..32)31..0
endif
if ((GPR[rs]95..64 + GPR[rt]95..64) > 0xFFFFFFFF) then
  GPR[rd]95..64 ← 0xFFFFFFFF
else
  GPR[rd]95..64 ← (GPR[rs]95..64 + GPR[rt]95..64)31..0
endif
if ((GPR[rs]127..96 + GPR[rt]127..96) > 0xFFFFFFFF) then
  GPR[rd]127..96 ← 0xFFFFFFFF
else
  GPR[rd]127..96 ← (GPR[rs]127..96 + GPR[rt]127..96)31..0
endif


      127..96   95..64   63..32   31..0
rs    A3        A2       A1       A0
      +         +        +        +
rt    B3        B2       B1       B0
rd*   A3+B3     A2+B2    A1+B1    A0+B0


* Saturate to unsigned word 


Exceptions: 


None 
PADDW Parallel Add Word PADDW

31 26 25 21 20 16 15 11 10 6 5 0
MMI (011100) rs rt rd PADDW MMI0
6 5 5 5 5 6
C790
Format: PADDW rd, rs, rt
Purpose: To add 4 pairs of 32-bit integers in parallel.
Description: rd ← rs + rt


The four word values in GPR rs are added to the corresponding four word values in GPR
rt in parallel. The results are placed into the corresponding four words in GPR rd.


No overflow or underflow exceptions are generated under any circumstances. 


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]31..0 ← (GPR[rs]31..0 + GPR[rt]31..0)31..0
GPR[rd]63..32 ← (GPR[rs]63..32 + GPR[rt]63..32)31..0
GPR[rd]95..64 ← (GPR[rs]95..64 + GPR[rt]95..64)31..0
GPR[rd]127..96 ← (GPR[rs]127..96 + GPR[rt]127..96)31..0


      127..96   95..64   63..32   31..0
rs    A3        A2       A1       A0
      +         +        +        +
rt    B3        B2       B1       B0
rd    A3+B3     A2+B2    A1+B1    A0+B0


Exceptions: 


None 
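
In contrast to the saturating forms, PADDW keeps only the low 32 bits of each sum. A minimal C sketch, assuming a hypothetical `q128_u32` type with w[0] holding bits 31..0 (the name `paddw_model` is illustrative):

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as four word lanes;
   w[0] is assumed to hold bits 31..0. */
typedef struct { uint32_t w[4]; } q128_u32;

/* Lane-wise modular add, modelling PADDW rd, rs, rt.
   Unsigned arithmetic is used because it wraps modulo 2^32 in C,
   which matches keeping bits 31..0 of each sum. */
static q128_u32 paddw_model(q128_u32 rs, q128_u32 rt) {
    q128_u32 rd;
    for (int i = 0; i < 4; i++) {
        rd.w[i] = rs.w[i] + rt.w[i];
    }
    return rd;
}
```

Because the bit pattern of a two's-complement sum does not depend on signedness, the same model covers both signed and unsigned interpretations of the operands.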


PADSBH Parallel Add/Subtract Halfword PADSBH

31 26 25 21 20 16 15 11 10 6 5 0
MMI (011100) rs rt rd PADSBH MMI1 (101000)
6 5 5 5 5 6
C790
Format: PADSBH rd, rs, rt
Purpose: To add/subtract 8 pairs of 16-bit integers in parallel.
Description: rd ← rs +/- rt


The high-order four halfword values in GPR rs are added to the corresponding four
halfword values in GPR rt, and the low-order four halfword values in GPR rt are
subtracted from the corresponding four halfword values in GPR rs, in parallel. The results
are placed into the corresponding eight halfwords in GPR rd.


No overflow or underflow exceptions are generated under any circumstances. 


This instruction operates on 128-bit registers. 


Operation:

GPR[rd]15..0 ← (GPR[rs]15..0 - GPR[rt]15..0)15..0
GPR[rd]31..16 ← (GPR[rs]31..16 - GPR[rt]31..16)15..0
GPR[rd]47..32 ← (GPR[rs]47..32 - GPR[rt]47..32)15..0
GPR[rd]63..48 ← (GPR[rs]63..48 - GPR[rt]63..48)15..0
GPR[rd]79..64 ← (GPR[rs]79..64 + GPR[rt]79..64)15..0
GPR[rd]95..80 ← (GPR[rs]95..80 + GPR[rt]95..80)15..0
GPR[rd]111..96 ← (GPR[rs]111..96 + GPR[rt]111..96)15..0
GPR[rd]127..112 ← (GPR[rs]127..112 + GPR[rt]127..112)15..0


      127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs    A7        A6       A5      A4      A3      A2      A1      A0
      +         +        +       +       -       -       -       -
rt    B7        B6       B5      B4      B3      B2      B1      B0
rd    A7+B7     A6+B6    A5+B5   A4+B4   A3-B3   A2-B2   A1-B1   A0-B0


Exceptions: 


None 
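
The split between subtracting lanes and adding lanes can be modelled as below. This is a sketch under stated assumptions: `q128_u16` and `padsbh_model` are hypothetical names, and h[0] is taken to hold bits 15..0.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as eight halfword lanes;
   h[0] is assumed to hold bits 15..0, h[7] bits 127..112. */
typedef struct { uint16_t h[8]; } q128_u16;

/* Modelling PADSBH rd, rs, rt: the four low-order lanes subtract,
   the four high-order lanes add, each result truncated to 16 bits. */
static q128_u16 padsbh_model(q128_u16 rs, q128_u16 rt) {
    q128_u16 rd;
    for (int i = 0; i < 4; i++) {
        rd.h[i] = (uint16_t)(rs.h[i] - rt.h[i]);   /* bits 63..0   */
    }
    for (int i = 4; i < 8; i++) {
        rd.h[i] = (uint16_t)(rs.h[i] + rt.h[i]);   /* bits 127..64 */
    }
    return rd;
}
```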


PAND Parallel And PAND

31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PAND MMI2
6 5 5 5 5 6
C790
Format: PAND rd, rs, rt
Purpose: To perform a bitwise logical AND.
Description: rd ← rs AND rt


The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical AND 
operation. The result is placed into GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 
GPR[rd]127..0 ← GPR[rs]127..0 and GPR[rt]127..0


      127..64      63..0
rs    A1           A0
      AND          AND
rt    B1           B0
rd    A1 AND B1    A0 AND B0


Exceptions: 


None 
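
Since standard C has no 128-bit integer type, a model of PAND can treat the register as two 64-bit halves. The type `q128_dw` and the function `pand_model` below are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as two doublewords:
   lo holds bits 63..0, hi holds bits 127..64. */
typedef struct { uint64_t lo, hi; } q128_dw;

/* Full-width bitwise AND, modelling PAND rd, rs, rt. */
static q128_dw pand_model(q128_dw rs, q128_dw rt) {
    q128_dw rd;
    rd.lo = rs.lo & rt.lo;
    rd.hi = rs.hi & rt.hi;
    return rd;
}
```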
PCEQB Parallel Compare for Equal Byte PCEQB

31 26 25 21 20 16 15 11 10 6 5 0
MMI (011100) rs rt rd PCEQB MMI1 (101000)
6 5 5 5 5 6
C790
Format: PCEQB rd, rs, rt
Purpose: To record the result of 16 equality comparisons in parallel.
Description: rd ← (rs = rt)


The sixteen signed byte values in GPR rs are compared to the corresponding sixteen 
signed byte values in GPR rt, in parallel. The results of the comparison are placed into 
GPR rd as follows: 


If the signed byte value in GPR rs is equal to the corresponding signed byte value in GPR
rt, then the corresponding byte in GPR rd is set to 0xFF; otherwise it is set to 0x00.


This instruction operates on 128-bit registers. 


Operation: 


if (GPR[rs]7..0 = GPR[rt]7..0) then
  GPR[rd]7..0 ← 1^8
else
  GPR[rd]7..0 ← 0^8
endif
if (GPR[rs]15..8 = GPR[rt]15..8) then
  GPR[rd]15..8 ← 1^8
else
  GPR[rd]15..8 ← 0^8
endif
if (GPR[rs]23..16 = GPR[rt]23..16) then
  GPR[rd]23..16 ← 1^8
else
  GPR[rd]23..16 ← 0^8
endif
if (GPR[rs]31..24 = GPR[rt]31..24) then
  GPR[rd]31..24 ← 1^8
else
  GPR[rd]31..24 ← 0^8
endif
if (GPR[rs]39..32 = GPR[rt]39..32) then
  GPR[rd]39..32 ← 1^8
else
  GPR[rd]39..32 ← 0^8
endif
if (GPR[rs]47..40 = GPR[rt]47..40) then
  GPR[rd]47..40 ← 1^8
else
  GPR[rd]47..40 ← 0^8
endif
if (GPR[rs]55..48 = GPR[rt]55..48) then
  GPR[rd]55..48 ← 1^8
else
  GPR[rd]55..48 ← 0^8
endif
if (GPR[rs]63..56 = GPR[rt]63..56) then
  GPR[rd]63..56 ← 1^8
else
  GPR[rd]63..56 ← 0^8
endif
if (GPR[rs]71..64 = GPR[rt]71..64) then
  GPR[rd]71..64 ← 1^8
else
  GPR[rd]71..64 ← 0^8
endif
if (GPR[rs]79..72 = GPR[rt]79..72) then
  GPR[rd]79..72 ← 1^8
else
  GPR[rd]79..72 ← 0^8
endif
if (GPR[rs]87..80 = GPR[rt]87..80) then
  GPR[rd]87..80 ← 1^8
else
  GPR[rd]87..80 ← 0^8
endif
if (GPR[rs]95..88 = GPR[rt]95..88) then
  GPR[rd]95..88 ← 1^8
else
  GPR[rd]95..88 ← 0^8
endif
if (GPR[rs]103..96 = GPR[rt]103..96) then
  GPR[rd]103..96 ← 1^8
else
  GPR[rd]103..96 ← 0^8
endif
if (GPR[rs]111..104 = GPR[rt]111..104) then
  GPR[rd]111..104 ← 1^8
else
  GPR[rd]111..104 ← 0^8
endif
if (GPR[rs]119..112 = GPR[rt]119..112) then
  GPR[rd]119..112 ← 1^8
else
  GPR[rd]119..112 ← 0^8
endif
if (GPR[rs]127..120 = GPR[rt]127..120) then
  GPR[rd]127..120 ← 1^8
else
  GPR[rd]127..120 ← 0^8
endif


[Figure: each of the sixteen bytes of rs (bits 127..120 down to 7..0) is compared for equality with the corresponding byte of rt; each byte of rd is set to 0xFF where the comparison is True and to 0x00 where it is False. Example results shown, from bits 127..120 down to 7..0: False True True True True False False True False True True True True False False True]


Exceptions: 


None 
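
The per-byte mask produced by PCEQB can be modelled in C as follows. The names `q128_u8` and `pceqb_model` are hypothetical, and b[0] is assumed to hold bits 7..0. Equality does not depend on signedness, so unsigned lanes are used here for simplicity.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as 16 byte lanes;
   b[0] is assumed to hold bits 7..0. */
typedef struct { uint8_t b[16]; } q128_u8;

/* Lane-wise equality compare, modelling PCEQB rd, rs, rt.
   Each result byte is all ones (0xFF) or all zeros (0x00). */
static q128_u8 pceqb_model(q128_u8 rs, q128_u8 rt) {
    q128_u8 rd;
    for (int i = 0; i < 16; i++) {
        rd.b[i] = (rs.b[i] == rt.b[i]) ? 0xFF : 0x00;
    }
    return rd;
}
```

An all-ones or all-zeros mask of this form can be combined with a bitwise AND (for example PAND) to keep or clear individual lanes without branching.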
PCEQH Parallel Compare for Equal Halfword PCEQH

31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PCEQH MMI1
6 5 5 5 5 6
C790
Format: PCEQH rd, rs, rt
Purpose: To record the results of 8 equality comparisons in parallel.
Description: rd ← (rs = rt)


The eight signed halfword values in GPR rs are compared to the corresponding eight 
signed halfword values in GPR rt, in parallel. The results of the comparison are placed 
into GPR rd as follows:


If the signed halfword value in GPR rs is equal to the corresponding signed halfword value
in GPR rt, then the corresponding halfword in GPR rd is set to 0xFFFF; otherwise it is set
to 0x0000.


This instruction operates on 128-bit registers. 


Operation: 


if (GPR[rs]15..0 = GPR[rt]15..0) then
  GPR[rd]15..0 ← 1^16
else
  GPR[rd]15..0 ← 0^16
endif
if (GPR[rs]31..16 = GPR[rt]31..16) then
  GPR[rd]31..16 ← 1^16
else
  GPR[rd]31..16 ← 0^16
endif
if (GPR[rs]47..32 = GPR[rt]47..32) then
  GPR[rd]47..32 ← 1^16
else
  GPR[rd]47..32 ← 0^16
endif
if (GPR[rs]63..48 = GPR[rt]63..48) then
  GPR[rd]63..48 ← 1^16
else
  GPR[rd]63..48 ← 0^16
endif
if (GPR[rs]79..64 = GPR[rt]79..64) then
  GPR[rd]79..64 ← 1^16
else
  GPR[rd]79..64 ← 0^16
endif
if (GPR[rs]95..80 = GPR[rt]95..80) then
  GPR[rd]95..80 ← 1^16
else
  GPR[rd]95..80 ← 0^16
endif
if (GPR[rs]111..96 = GPR[rt]111..96) then
  GPR[rd]111..96 ← 1^16
else
  GPR[rd]111..96 ← 0^16
endif
if (GPR[rs]127..112 = GPR[rt]127..112) then
  GPR[rd]127..112 ← 1^16
else
  GPR[rd]127..112 ← 0^16
endif
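[Figure: each of the eight halfwords of rs (bits 127..112 down to 15..0) is compared for equality with the corresponding halfword of rt; each halfword of rd is set to 0xFFFF where the comparison is True and to 0x0000 where it is False. Example results shown, from bits 127..112 down to 15..0: False True False True False True False True]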


Exceptions: 


None 
PCEQW Parallel Compare for Equal Word PCEQW

31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PCEQW MMI1
6 5 5 5 5 6
C790
Format: PCEQW rd, rs, rt
Purpose: To record the result of 4 equality comparisons in parallel.
Description: rd ← (rs = rt)


The four signed word values in GPR rs are compared to the corresponding four signed 
word values in GPR rt, in parallel. The results of the comparison are placed into GPR rd 
as follows: 


If the signed word value in GPR rs is equal to the corresponding signed word value in GPR
rt, then the corresponding word in GPR rd is set to 0xFFFFFFFF; otherwise it is set to
0x00000000.


This instruction operates on 128-bit registers. 


Operation: 


if (GPR[rs]31..0 = GPR[rt]31..0) then
  GPR[rd]31..0 ← 1^32
else
  GPR[rd]31..0 ← 0^32
endif
if (GPR[rs]63..32 = GPR[rt]63..32) then
  GPR[rd]63..32 ← 1^32
else
  GPR[rd]63..32 ← 0^32
endif
if (GPR[rs]95..64 = GPR[rt]95..64) then
  GPR[rd]95..64 ← 1^32
else
  GPR[rd]95..64 ← 0^32
endif
if (GPR[rs]127..96 = GPR[rt]127..96) then
  GPR[rd]127..96 ← 1^32
else
  GPR[rd]127..96 ← 0^32
endif

[Figure: each of the four words of rs (bits 127..96 down to 31..0) is compared for equality with the corresponding word of rt; each word of rd is set to 0xFFFFFFFF where the comparison is True and to 0x00000000 where it is False. Example results shown, from bits 127..96 down to 31..0: False True False True]


Exceptions: 


None 


PCGTB Parallel Compare for Greater Than Byte PCGTB

31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PCGTB MMI0
6 5 5 5 5 6
C790
Format: PCGTB rd, rs, rt
Purpose: To record the result of 16 greater-than comparisons in parallel.
Description: rd ← (rs > rt)


The sixteen signed byte values in GPR rs are compared to the corresponding sixteen 
signed byte values in GPR rt in parallel. The results of the comparison are placed into 
GPR rd as follows: 


If the signed byte value in GPR rs is greater than the corresponding signed byte value in
GPR rt, then the corresponding byte in GPR rd is set to 0xFF; otherwise it is set to 0x00.


This instruction operates on 128-bit registers. 


Operation: 


if (GPR[rs]7..0 > GPR[rt]7..0) then
  GPR[rd]7..0 ← 1^8
else
  GPR[rd]7..0 ← 0^8
endif
if (GPR[rs]15..8 > GPR[rt]15..8) then
  GPR[rd]15..8 ← 1^8
else
  GPR[rd]15..8 ← 0^8
endif
if (GPR[rs]23..16 > GPR[rt]23..16) then
  GPR[rd]23..16 ← 1^8
else
  GPR[rd]23..16 ← 0^8
endif
if (GPR[rs]31..24 > GPR[rt]31..24) then
  GPR[rd]31..24 ← 1^8
else
  GPR[rd]31..24 ← 0^8
endif
if (GPR[rs]39..32 > GPR[rt]39..32) then
  GPR[rd]39..32 ← 1^8
else
  GPR[rd]39..32 ← 0^8
endif
if (GPR[rs]47..40 > GPR[rt]47..40) then
  GPR[rd]47..40 ← 1^8
else
  GPR[rd]47..40 ← 0^8
endif
if (GPR[rs]55..48 > GPR[rt]55..48) then
  GPR[rd]55..48 ← 1^8
else
  GPR[rd]55..48 ← 0^8
endif
if (GPR[rs]63..56 > GPR[rt]63..56) then
  GPR[rd]63..56 ← 1^8
else
  GPR[rd]63..56 ← 0^8
endif
if (GPR[rs]71..64 > GPR[rt]71..64) then
  GPR[rd]71..64 ← 1^8
else
  GPR[rd]71..64 ← 0^8
endif
if (GPR[rs]79..72 > GPR[rt]79..72) then
  GPR[rd]79..72 ← 1^8
else
  GPR[rd]79..72 ← 0^8
endif
if (GPR[rs]87..80 > GPR[rt]87..80) then
  GPR[rd]87..80 ← 1^8
else
  GPR[rd]87..80 ← 0^8
endif
if (GPR[rs]95..88 > GPR[rt]95..88) then
  GPR[rd]95..88 ← 1^8
else
  GPR[rd]95..88 ← 0^8
endif
if (GPR[rs]103..96 > GPR[rt]103..96) then
  GPR[rd]103..96 ← 1^8
else
  GPR[rd]103..96 ← 0^8
endif
if (GPR[rs]111..104 > GPR[rt]111..104) then
  GPR[rd]111..104 ← 1^8
else
  GPR[rd]111..104 ← 0^8
endif
if (GPR[rs]119..112 > GPR[rt]119..112) then
  GPR[rd]119..112 ← 1^8
else
  GPR[rd]119..112 ← 0^8
endif
if (GPR[rs]127..120 > GPR[rt]127..120) then
  GPR[rd]127..120 ← 1^8
else
  GPR[rd]127..120 ← 0^8
endif


[Figure: each of the sixteen signed bytes of rs (bits 127..120 down to 7..0) is compared with the corresponding signed byte of rt; each byte of rd is set to 0xFF where rs > rt is True and to 0x00 where it is False.]


Exceptions: 


None 
PCGTH Parallel Compare for Greater Than Halfword PCGTH

31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PCGTH MMI0
6 5 5 5 5 6
C790
Format: PCGTH rd, rs, rt
Purpose: To record the results of 8 greater-than comparisons in parallel.
Description: rd ← (rs > rt)


The eight signed halfword values in GPR rs are compared to the corresponding eight 
signed halfword values in GPR rt in parallel. The results of the comparison are placed into 
GPR rd as follows: 


If the signed halfword value in GPR rs is greater than the corresponding signed halfword
value in GPR rt, then the corresponding halfword in GPR rd is set to 0xFFFF; otherwise it
is set to 0x0000.


This instruction operates on 128-bit registers. 


Operation: 


if (GPR[rs]15..0 > GPR[rt]15..0) then
  GPR[rd]15..0 ← 1^16
else
  GPR[rd]15..0 ← 0^16
endif
if (GPR[rs]31..16 > GPR[rt]31..16) then
  GPR[rd]31..16 ← 1^16
else
  GPR[rd]31..16 ← 0^16
endif
if (GPR[rs]47..32 > GPR[rt]47..32) then
  GPR[rd]47..32 ← 1^16
else
  GPR[rd]47..32 ← 0^16
endif
if (GPR[rs]63..48 > GPR[rt]63..48) then
  GPR[rd]63..48 ← 1^16
else
  GPR[rd]63..48 ← 0^16
endif
if (GPR[rs]79..64 > GPR[rt]79..64) then
  GPR[rd]79..64 ← 1^16
else
  GPR[rd]79..64 ← 0^16
endif
if (GPR[rs]95..80 > GPR[rt]95..80) then
  GPR[rd]95..80 ← 1^16
else
  GPR[rd]95..80 ← 0^16
endif
if (GPR[rs]111..96 > GPR[rt]111..96) then
  GPR[rd]111..96 ← 1^16
else
  GPR[rd]111..96 ← 0^16
endif
if (GPR[rs]127..112 > GPR[rt]127..112) then
  GPR[rd]127..112 ← 1^16
else
  GPR[rd]127..112 ← 0^16
endif
[Figure: each of the eight signed halfwords of rs (bits 127..112 down to 15..0) is compared with the corresponding signed halfword of rt; each halfword of rd is set to 0xFFFF where rs > rt is True and to 0x0000 where it is False. Example results shown, from bits 127..112 down to 15..0: True False False False True False False False]


Exceptions: 


None 
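
Because the comparison is signed, a halfword such as 0xFFFF (-1) is not greater than 0x0001. The C sketch below makes this explicit by using signed lanes for the operands; `q128_i16`, `q128_u16` and `pcgth_model` are hypothetical names, and h[0] is assumed to hold bits 15..0.

```c
#include <stdint.h>

/* Hypothetical models of a 128-bit GPR as eight halfword lanes;
   h[0] is assumed to hold bits 15..0. */
typedef struct { int16_t  h[8]; } q128_i16;   /* signed operands */
typedef struct { uint16_t h[8]; } q128_u16;   /* result masks    */

/* Lane-wise signed greater-than, modelling PCGTH rd, rs, rt. */
static q128_u16 pcgth_model(q128_i16 rs, q128_i16 rt) {
    q128_u16 rd;
    for (int i = 0; i < 8; i++) {
        rd.h[i] = (rs.h[i] > rt.h[i]) ? 0xFFFF : 0x0000;
    }
    return rd;
}
```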
PCGTW Parallel Compare for Greater Than Word PCGTW

31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PCGTW MMI0
6 5 5 5 5 6
C790
Format: PCGTW rd, rs, rt
Purpose: To record the results of 4 greater-than comparisons in parallel.
Description: rd ← (rs > rt)


The four signed word values in GPR rs are compared to the corresponding four signed
word values in GPR rt in parallel. The results of the comparison are placed into GPR rd as
follows:


If the signed word value in GPR rs is greater than the corresponding signed word value in
GPR rt, then the corresponding word in GPR rd is set to 0xFFFFFFFF; otherwise it is set to
0x00000000.


This instruction operates on 128-bit registers. 


Operation: 


if (GPR[rs]31..0 > GPR[rt]31..0) then
  GPR[rd]31..0 ← 1^32
else
  GPR[rd]31..0 ← 0^32
endif
if (GPR[rs]63..32 > GPR[rt]63..32) then
  GPR[rd]63..32 ← 1^32
else
  GPR[rd]63..32 ← 0^32
endif
if (GPR[rs]95..64 > GPR[rt]95..64) then
  GPR[rd]95..64 ← 1^32
else
  GPR[rd]95..64 ← 0^32
endif
if (GPR[rs]127..96 > GPR[rt]127..96) then
  GPR[rd]127..96 ← 1^32
else
  GPR[rd]127..96 ← 0^32
endif

[Figure: each of the four signed words of rs (bits 127..96 down to 31..0) is compared with the corresponding signed word of rt; each word of rd is set to 0xFFFFFFFF where rs > rt is True and to 0x00000000 where it is False. Example results shown, from bits 127..96 down to 31..0: False True False True]


Exceptions:


None 
PCPYH Parallel Copy Halfword PCPYH

31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 (00000) rt rd PCPYH MMI3
6 5 5 5 5 6
C790
Format: PCPYH rd, rt
Purpose: To copy halfword.
Description: rd ← copy (rt)


The low-order halfword of each of the two doublewords in GPR rt is copied to every
halfword of the corresponding doubleword in GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]15..0 ← GPR[rt]15..0
GPR[rd]31..16 ← GPR[rt]15..0
GPR[rd]47..32 ← GPR[rt]15..0
GPR[rd]63..48 ← GPR[rt]15..0
GPR[rd]79..64 ← GPR[rt]79..64
GPR[rd]95..80 ← GPR[rt]79..64
GPR[rd]111..96 ← GPR[rt]79..64
GPR[rd]127..112 ← GPR[rt]79..64


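[Figure: bits 79..64 of rt are replicated into the four halfwords of bits 127..64 of rd; bits 15..0 of rt are replicated into the four halfwords of bits 63..0 of rd.]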


Exceptions: 


None 
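
The broadcast performed by PCPYH can be sketched in C as follows, assuming a hypothetical `q128_u16` type in which h[0] holds bits 15..0 and h[4] holds bits 79..64. The function name `pcpyh_model` is illustrative only.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as eight halfword lanes;
   h[0] is assumed to hold bits 15..0, h[4] bits 79..64. */
typedef struct { uint16_t h[8]; } q128_u16;

/* Modelling PCPYH rd, rt: the lowest halfword of each doubleword
   is replicated across that doubleword. */
static q128_u16 pcpyh_model(q128_u16 rt) {
    q128_u16 rd;
    for (int i = 0; i < 4; i++) {
        rd.h[i]     = rt.h[0];   /* lower doubleword, source bits 15..0  */
        rd.h[i + 4] = rt.h[4];   /* upper doubleword, source bits 79..64 */
    }
    return rd;
}
```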
PCPYLD Parallel Copy Lower Doubleword PCPYLD

31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PCPYLD MMI2
6 5 5 5 5 6
C790
Format: PCPYLD rd, rs, rt
Purpose: To copy doubleword.
Description: rd ← copy (rs, rt)


The contents of the low-order doubleword in GPR rs are combined with the contents of the 
low-order doubleword in GPR rt. The quadword result is placed into GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]63..0 ← GPR[rt]63..0
GPR[rd]127..64 ← GPR[rs]63..0


Exceptions: 


None 
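
A C model of PCPYLD makes the operand placement explicit: the low doubleword of rt stays in the low half, and the low doubleword of rs moves to the high half. The names `q128_dw` and `pcpyld_model` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as two doublewords:
   lo holds bits 63..0, hi holds bits 127..64. */
typedef struct { uint64_t lo, hi; } q128_dw;

/* Modelling PCPYLD rd, rs, rt. */
static q128_dw pcpyld_model(q128_dw rs, q128_dw rt) {
    q128_dw rd;
    rd.lo = rt.lo;   /* GPR[rd]63..0   <- GPR[rt]63..0 */
    rd.hi = rs.lo;   /* GPR[rd]127..64 <- GPR[rs]63..0 */
    return rd;
}
```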




PCPYUD Parallel Copy Upper Doubleword PCPYUD
31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PCPYUD MMI3
6 5 5 5 5 6
C790
Format: PCPYUD rd, rs, rt
Purpose: To copy doubleword.
Description: rd ← copy (rs, rt)


The contents of the high-order doubleword in GPR rs are combined with the contents of 
the high-order doubleword in GPR rt. The quadword result is placed into GPR rd. 


This instruction operates on 128-bit registers. 


Operation:


GPR[rd]63..0 ← GPR[rs]127..64
GPR[rd]127..64 ← GPR[rt]127..64


Exceptions: 


None 
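
Note that the operand placement for PCPYUD is the mirror image of PCPYLD: here the high doubleword of rs lands in the low half of rd. A minimal C sketch, with `q128_dw` and `pcpyud_model` as hypothetical names:

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit GPR as two doublewords:
   lo holds bits 63..0, hi holds bits 127..64. */
typedef struct { uint64_t lo, hi; } q128_dw;

/* Modelling PCPYUD rd, rs, rt. */
static q128_dw pcpyud_model(q128_dw rs, q128_dw rt) {
    q128_dw rd;
    rd.lo = rs.hi;   /* GPR[rd]63..0   <- GPR[rs]127..64 */
    rd.hi = rt.hi;   /* GPR[rd]127..64 <- GPR[rt]127..64 */
    return rd;
}
```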
Format: PDIVBW rs, rt
Purpose: To divide four 32-bit signed integers by a 16-bit signed integer in parallel.
Description: (LO, HI) ← rs / rt


The four signed words in GPR rs are divided by the low-order signed halfword in GPR rt,
in parallel. The four 32-bit quotients are placed into special register LO. The four 16-bit
remainders are sign-extended to 32 bits and placed into special register HI.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


If the divisor in GPR rt is zero, the arithmetic result value is undefined. 


Operation: 


q0 ← GPR[rs]31..0   div GPR[rt]15..0
r0 ← GPR[rs]31..0   mod GPR[rt]15..0
q1 ← GPR[rs]63..32  div GPR[rt]15..0
r1 ← GPR[rs]63..32  mod GPR[rt]15..0
q2 ← GPR[rs]95..64  div GPR[rt]15..0
r2 ← GPR[rs]95..64  mod GPR[rt]15..0
q3 ← GPR[rs]127..96 div GPR[rt]15..0
r3 ← GPR[rs]127..96 mod GPR[rt]15..0


LO31..0   ← q0 31..0
HI31..0   ← (r0 15)^16 || r0 15..0
LO63..32  ← q1 31..0
HI63..32  ← (r1 15)^16 || r1 15..0
LO95..64  ← q2 31..0
HI95..64  ← (r2 15)^16 || r2 15..0
LO127..96 ← q3 31..0
HI127..96 ← (r3 15)^16 || r3 15..0
        127..96                95..64                 63..32                 31..0
HI   sign ext (A3 mod B0)   sign ext (A2 mod B0)   sign ext (A1 mod B0)   sign ext (A0 mod B0)
LO   A3 div B0              A2 div B0              A1 div B0              A0 div B0


Supplementary explanation: 


When 0x80000000 (-2147483648), the most negative value, is divided by 0xFFFF (-1), the
operation overflows. However, no overflow exception occurs, and the operation produces the
following result:


Quotient is 0x80000000 (-2147483648), and remainder is 0x00000000 (0).


Exceptions: 


None 


Programming Notes: 


In the C790 the integer divide operation proceeds asynchronously and allows other CPU 
instructions to execute before it is retired. An attempt to read LO or HI before the results
are written will cause an interlock until the results are ready. Asynchronous execution 
does not affect the program result, but offers an opportunity for performance improvement 
by scheduling the divide so that other instructions can execute in parallel. 


No arithmetic exception occurs under any circumstances. If divide-by-zero or overflow 
conditions should be detected and some action taken, then the divide instruction is 
typically followed by additional instructions to check for a zero divisor and/or for overflow.
If the divide is asynchronous then the zero-divisor check can execute in parallel with the 
divide. The action taken on either divide-by-zero or overflow is either a convention within 
the program itself or more typically, the system software; one possibility is to take a 
BREAK exception with a code field value to signal the problem to the system software. 


As an example, the C programming language in a UNIX environment expects division by 
zero to either terminate the program or execute a program-specified signal handler. C 
does not expect overflow to cause any exceptional condition. If the C compiler uses a divide 
instruction, it also emits code to test for a zero divisor and execute a BREAK instruction to 
inform the operating system if one is detected. 
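
The following Python sketch illustrates the PDIVBW operation together with the software zero-divisor check described in the notes above. It is a model under stated assumptions, not a definitive implementation: the quotient is assumed to truncate toward zero with the remainder taking the sign of the dividend, the ZeroDivisionError merely stands in for the BREAK a compiler might emit, and the names to_signed and pdivbw are hypothetical.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def pdivbw(rs: int, rt: int):
    """Return (LO, HI) as 128-bit integers for the PDIVBW operation."""
    divisor = to_signed(rt, 16)
    if divisor == 0:
        # Software check; the hardware raises nothing and the result is undefined.
        raise ZeroDivisionError("PDIVBW divisor is zero")
    lo = hi = 0
    for lane in range(4):
        dividend = to_signed(rs >> (32 * lane), 32)
        # Assumption: truncate toward zero, remainder follows the dividend.
        quotient = abs(dividend) // abs(divisor)
        if (dividend < 0) != (divisor < 0):
            quotient = -quotient
        remainder = dividend - quotient * divisor
        lo |= (quotient & 0xFFFFFFFF) << (32 * lane)
        hi |= (remainder & 0xFFFFFFFF) << (32 * lane)  # sign-extended to 32 bits
    return lo, hi

# The overflow case from the supplementary explanation: 0x80000000 / 0xFFFF (-1).
print([hex(v) for v in pdivbw(0x80000000, 0xFFFF)])  # ['0x80000000', '0x0']
```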
C790
Format: PDIVUW rs, rt
Purpose: To divide two pairs of 32-bit unsigned integers in parallel.
Description: (LO, HI) ← rs / rt


The low-order unsigned words of the two doublewords in GPR rs are divided by the
low-order unsigned words of the two doublewords in GPR rt, in parallel. The two 32-bit
quotients are placed into special register LO. The two 32-bit remainders are placed into
special register HI.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


If either GPR rt or GPR rs does not contain zero-extended 32-bit values (bits 127..96 and
63..32 equal zero), the result of the operation will be undefined.


If the divisor in GPR rt is zero, the result will be undefined. 
Operation: 


if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
q0 ← (0 || GPR[rs]31..0)  div (0 || GPR[rt]31..0)
r0 ← (0 || GPR[rs]31..0)  mod (0 || GPR[rt]31..0)
q1 ← (0 || GPR[rs]95..64) div (0 || GPR[rt]95..64)
r1 ← (0 || GPR[rs]95..64) mod (0 || GPR[rt]95..64)
LO63..0   ← (q0 31)^32 || q0 31..0
HI63..0   ← (r0 31)^32 || r0 31..0
LO127..64 ← (q1 31)^32 || q1 31..0
HI127..64 ← (r1 31)^32 || r1 31..0


Exceptions:


None 


Programming Notes: 


See the Programming Notes for the PDIVBW instruction. 
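
A minimal Python sketch of the PDIVUW operation follows, including the operand restrictions. The exceptions raised here are an assumption of the model: on the C790 itself a violated restriction simply yields an undefined result. The names pdivuw and sign_extend_word are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

def sign_extend_word(word: int) -> int:
    """Replicate bit 31 of a 32-bit word into bits 63..32."""
    word &= MASK32
    return word | (MASK64 ^ MASK32) if word & 0x80000000 else word

def pdivuw(rs: int, rt: int):
    """Return (LO, HI) as 128-bit integers for the PDIVUW operation."""
    lo = hi = 0
    for shift in (0, 64):
        dividend = (rs >> shift) & MASK64
        divisor = (rt >> shift) & MASK64
        if dividend > MASK32 or divisor > MASK32:
            raise ValueError("operand is not a zero-extended 32-bit value")
        if divisor == 0:
            raise ZeroDivisionError("PDIVUW divisor is zero")
        # Quotient and remainder are 32-bit values, sign-extended into 64 bits.
        lo |= sign_extend_word(dividend // divisor) << shift
        hi |= sign_extend_word(dividend % divisor) << shift
    return lo, hi

lo, hi = pdivuw((10 << 64) | 0xFFFFFFFE, (3 << 64) | 1)
print(hex(lo), hex(hi))
```

The first lane shows that a quotient with bit 31 set is still sign-extended in LO, as the Operation section specifies, even though the division itself is unsigned.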
C790
Format: PDIVW rs, rt
Purpose: To divide two pairs of 32-bit signed integers in parallel.
Description: (LO, HI) ← rs / rt


The low-order signed words of the two doublewords in GPR rs are divided by the low-order
signed words of the two doublewords in GPR rt, in parallel. The two 32-bit quotients are
placed into special register LO. The two 32-bit remainders are placed into special register
HI.


No arithmetic exception occurs under any circumstances. 
This instruction operates on 128-bit registers. 
Restrictions: 


If either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 127..95 equal and
63..31 equal), the result of the operation will be undefined.
If the divisor in GPR rt is zero, the result will be undefined.


Operation: 


if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
q0 ← GPR[rs]31..0  div GPR[rt]31..0
r0 ← GPR[rs]31..0  mod GPR[rt]31..0
q1 ← GPR[rs]95..64 div GPR[rt]95..64
r1 ← GPR[rs]95..64 mod GPR[rt]95..64


LO63..0   ← (q0 31)^32 || q0 31..0
HI63..0   ← (r0 31)^32 || r0 31..0
LO127..64 ← (q1 31)^32 || q1 31..0
HI127..64 ← (r1 31)^32 || r1 31..0


Supplementary explanation:


When 0x80000000 (-2147483648), the most negative value, is divided by 0xFFFFFFFF (-1),
the operation overflows. However, no overflow exception occurs, and the operation produces
the following result:
Quotient (q) is 0x80000000 (-2147483648), and remainder (r) is 0x00000000 (0).
Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the PDIVBW instruction. 
Format: PEXCH rd, rt
Purpose: To exchange halfwords.
Description: rd ← exchange (rt)


The two central halfwords of the high-order doubleword in GPR rt are exchanged and the 
two central halfwords of the low-order doubleword in GPR rt are exchanged. The results 
are copied to GPR rd while other halfwords are copied directly to the corresponding 
halfwords. 


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]15..0    ← GPR[rt]15..0
GPR[rd]31..16   ← GPR[rt]47..32
GPR[rd]47..32   ← GPR[rt]31..16
GPR[rd]63..48   ← GPR[rt]63..48
GPR[rd]79..64   ← GPR[rt]79..64
GPR[rd]95..80   ← GPR[rt]111..96
GPR[rd]111..96  ← GPR[rt]95..80
GPR[rd]127..112 ← GPR[rt]127..112


Exceptions: 


None 
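
The halfword exchange can be viewed as a fixed permutation of the eight halfword lanes. The Python sketch below models PEXCH that way; the helper permute_halfwords and the table PEXCH_ORDER are hypothetical names, and the table is simply read off the Operation list above.

```python
def permute_halfwords(value: int, order) -> int:
    """Build a 128-bit result whose halfword i is halfword order[i] of value."""
    lanes = [(value >> (16 * i)) & 0xFFFF for i in range(8)]
    result = 0
    for dst, src in enumerate(order):
        result |= lanes[src] << (16 * dst)
    return result

# Source halfword for each destination halfword 0..7.
PEXCH_ORDER = (0, 2, 1, 3, 4, 6, 5, 7)

def pexch(rt: int) -> int:
    return permute_halfwords(rt, PEXCH_ORDER)

rt = 0x7777_6666_5555_4444_3333_2222_1111_0000
print(hex(pexch(rt)))  # 0x77775555666644443333111122220000
```

Because the permutation only swaps pairs of lanes, applying it twice returns the original value.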
Format: PEXCW rd, rt
Purpose: To exchange words.
Description: rd ← exchange (rt)


The two central words in GPR rt are exchanged. The results are copied to GPR rd while 
other words are copied directly to the corresponding words. 


This instruction operates on 128-bit registers. 


Operation: 
GPR[rd]31..0   ← GPR[rt]31..0
GPR[rd]63..32  ← GPR[rt]95..64
GPR[rd]95..64  ← GPR[rt]63..32
GPR[rd]127..96 ← GPR[rt]127..96


Exceptions: 


None 
Format: PEXEH rd, rt
Purpose: To exchange halfwords.
Description: rd ← exchange (rt)


The two low-order halfwords of the two words of the high-order doubleword in GPR rt are 
exchanged and the two low-order halfwords of the two words of the low-order doubleword 
in GPR rt are exchanged. The results are copied to GPR rd while other halfwords are 
copied directly to the corresponding halfwords. 


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]15..0    ← GPR[rt]47..32
GPR[rd]31..16   ← GPR[rt]31..16
GPR[rd]47..32   ← GPR[rt]15..0
GPR[rd]63..48   ← GPR[rt]63..48
GPR[rd]79..64   ← GPR[rt]111..96
GPR[rd]95..80   ← GPR[rt]95..80
GPR[rd]111..96  ← GPR[rt]79..64
GPR[rd]127..112 ← GPR[rt]127..112


Exceptions: 


None 
Format: PEXEW rd, rt
Purpose: To exchange words.
Description: rd ← exchange (rt)


The two low-order words of the two doublewords in GPR rt are exchanged. The results are 
copied to GPR rd while other words are copied directly to the corresponding words. 


This instruction operates on 128-bit registers. 


Operation: 
GPR[rd]31..0   ← GPR[rt]95..64
GPR[rd]63..32  ← GPR[rt]63..32
GPR[rd]95..64  ← GPR[rt]31..0
GPR[rd]127..96 ← GPR[rt]127..96


Exceptions: 


None 
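
The two word-exchange instructions differ only in which pair of words is swapped: PEXCW swaps the two central words, PEXEW the low-order words of the two doublewords. A small Python sketch makes the difference concrete. The helper and table names are hypothetical, and the tables are read off the respective Operation lists.

```python
def permute_words(value: int, order) -> int:
    """Build a 128-bit result whose word i is word order[i] of value."""
    words = [(value >> (32 * i)) & 0xFFFFFFFF for i in range(4)]
    result = 0
    for dst, src in enumerate(order):
        result |= words[src] << (32 * dst)
    return result

# Source word for each destination word 0..3.
PEXCW_ORDER = (0, 2, 1, 3)
PEXEW_ORDER = (2, 1, 0, 3)

rt = 0x33333333_22222222_11111111_00000000
print(hex(permute_words(rt, PEXCW_ORDER)))
print(hex(permute_words(rt, PEXEW_ORDER)))
```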
Format: PEXT5 rd, rt
Purpose: To extend bytes from 5 bits.
Description: rd ← extend (rt)


The low-order 16 bits (in 1-5-5-5 bit format) of each of the four words in GPR rt are
extended to 32 bits (in 8-8-8-8 bit format). The quadword result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation:


GPR[rd]2..0     ← 0^3
GPR[rd]7..3     ← GPR[rt]4..0
GPR[rd]10..8    ← 0^3
GPR[rd]15..11   ← GPR[rt]9..5
GPR[rd]18..16   ← 0^3
GPR[rd]23..19   ← GPR[rt]14..10
GPR[rd]30..24   ← 0^7
GPR[rd]31       ← GPR[rt]15
GPR[rd]34..32   ← 0^3
GPR[rd]39..35   ← GPR[rt]36..32
GPR[rd]42..40   ← 0^3
GPR[rd]47..43   ← GPR[rt]41..37
GPR[rd]50..48   ← 0^3
GPR[rd]55..51   ← GPR[rt]46..42
GPR[rd]62..56   ← 0^7
GPR[rd]63       ← GPR[rt]47
GPR[rd]66..64   ← 0^3
GPR[rd]71..67   ← GPR[rt]68..64
GPR[rd]74..72   ← 0^3
GPR[rd]79..75   ← GPR[rt]73..69
GPR[rd]82..80   ← 0^3
GPR[rd]87..83   ← GPR[rt]78..74
GPR[rd]94..88   ← 0^7
GPR[rd]95       ← GPR[rt]79
GPR[rd]98..96   ← 0^3
GPR[rd]103..99  ← GPR[rt]100..96
GPR[rd]106..104 ← 0^3
GPR[rd]111..107 ← GPR[rt]105..101
GPR[rd]114..112 ← 0^3
GPR[rd]119..115 ← GPR[rt]110..106
GPR[rd]126..120 ← 0^7
GPR[rd]127      ← GPR[rt]111
[Overview figure: expansion of the 1-5-5-5 halfwords of rt into the 8-8-8-8 words of rd]


Exceptions: 


None 
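
The bit-level Operation list is easier to follow when written as a per-word routine. The Python sketch below is one way to express it; the function names are hypothetical. Each 5-bit field lands in the upper five bits of its byte with the low three bits cleared, and bit 15 moves to bit 31.

```python
def pext5_word(word: int) -> int:
    """Expand the low-order 1-5-5-5 halfword of a word to an 8-8-8-8 word."""
    pixel = word & 0xFFFF
    out = 0
    for field in range(3):
        five = (pixel >> (5 * field)) & 0x1F
        out |= (five << 3) << (8 * field)   # bits 7..3, 15..11, 23..19
    out |= ((pixel >> 15) & 1) << 31        # single flag bit goes to bit 31
    return out

def pext5(rt: int) -> int:
    """Apply the expansion to each of the four words of a 128-bit value."""
    out = 0
    for lane in range(4):
        out |= pext5_word((rt >> (32 * lane)) & 0xFFFFFFFF) << (32 * lane)
    return out

print(hex(pext5_word(0xDEADFFFF)))  # 0x80f8f8f8; the upper halfword is ignored
```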
Format: PEXTLB rd, rs, rt
Purpose: To extend halfwords from bytes.
Description: rd ← extend (rs, rt)


The contents of the low-order doubleword in GPR rs are combined with the contents of the
low-order doubleword in GPR rt in a byte-wide interleaved operation. The quadword
result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation:


GPR[rd]7..0     ← GPR[rt]7..0
GPR[rd]15..8    ← GPR[rs]7..0
GPR[rd]23..16   ← GPR[rt]15..8
GPR[rd]31..24   ← GPR[rs]15..8
GPR[rd]39..32   ← GPR[rt]23..16
GPR[rd]47..40   ← GPR[rs]23..16
GPR[rd]55..48   ← GPR[rt]31..24
GPR[rd]63..56   ← GPR[rs]31..24
GPR[rd]71..64   ← GPR[rt]39..32
GPR[rd]79..72   ← GPR[rs]39..32
GPR[rd]87..80   ← GPR[rt]47..40
GPR[rd]95..88   ← GPR[rs]47..40
GPR[rd]103..96  ← GPR[rt]55..48
GPR[rd]111..104 ← GPR[rs]55..48
GPR[rd]119..112 ← GPR[rt]63..56
GPR[rd]127..120 ← GPR[rs]63..56


Exceptions: 


None 
Format: PEXTLH rd, rs, rt
Purpose: To extend words from halfwords.
Description: rd ← extend (rs, rt)


The contents of the low-order doubleword in GPR rs are combined with the contents of the
low-order doubleword in GPR rt in a halfword-wide interleaved operation. The quadword
result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation:


GPR[rd]15..0    ← GPR[rt]15..0
GPR[rd]31..16   ← GPR[rs]15..0
GPR[rd]47..32   ← GPR[rt]31..16
GPR[rd]63..48   ← GPR[rs]31..16
GPR[rd]79..64   ← GPR[rt]47..32
GPR[rd]95..80   ← GPR[rs]47..32
GPR[rd]111..96  ← GPR[rt]63..48
GPR[rd]127..112 ← GPR[rs]63..48


Exceptions: 


None 
Format: PEXTLW rd, rs, rt
Purpose: To extend doublewords from words.
Description: rd ← extend (rs, rt)


The contents of the low-order doubleword in GPR rs are combined with the contents of the
low-order doubleword in GPR rt in a word-wide interleaved operation. The quadword
result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]31..0   ← GPR[rt]31..0
GPR[rd]63..32  ← GPR[rs]31..0
GPR[rd]95..64  ← GPR[rt]63..32
GPR[rd]127..96 ← GPR[rs]63..32


Exceptions: 


None 
Format: PEXTUB rd, rs, rt
Purpose: To extend halfwords from bytes.
Description: rd ← extend (rs, rt)


The contents of the high-order doubleword in GPR rs are combined with the contents of
the high-order doubleword in GPR rt in a byte-wide interleaved operation. The quadword
result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]7..0     ← GPR[rt]71..64
GPR[rd]15..8    ← GPR[rs]71..64
GPR[rd]23..16   ← GPR[rt]79..72
GPR[rd]31..24   ← GPR[rs]79..72
GPR[rd]39..32   ← GPR[rt]87..80
GPR[rd]47..40   ← GPR[rs]87..80
GPR[rd]55..48   ← GPR[rt]95..88
GPR[rd]63..56   ← GPR[rs]95..88
GPR[rd]71..64   ← GPR[rt]103..96
GPR[rd]79..72   ← GPR[rs]103..96
GPR[rd]87..80   ← GPR[rt]111..104
GPR[rd]95..88   ← GPR[rs]111..104
GPR[rd]103..96  ← GPR[rt]119..112
GPR[rd]111..104 ← GPR[rs]119..112
GPR[rd]119..112 ← GPR[rt]127..120
GPR[rd]127..120 ← GPR[rs]127..120


Exceptions: 


None 
Format: PEXTUH rd, rs, rt
Purpose: To extend words from halfwords.
Description: rd ← extend (rs, rt)


The contents of the high-order doubleword in GPR rs are combined with the contents of
the high-order doubleword in GPR rt in a halfword-wide interleaved operation. The
quadword result is placed into GPR rd.


This instruction operates on 128-bit registers. 
Operation: 


GPR[rd]15..0    ← GPR[rt]79..64
GPR[rd]31..16   ← GPR[rs]79..64
GPR[rd]47..32   ← GPR[rt]95..80
GPR[rd]63..48   ← GPR[rs]95..80
GPR[rd]79..64   ← GPR[rt]111..96
GPR[rd]95..80   ← GPR[rs]111..96
GPR[rd]111..96  ← GPR[rt]127..112
GPR[rd]127..112 ← GPR[rs]127..112


Exceptions: 


None 
Format: PEXTUW rd, rs, rt
Purpose: To extend doublewords from words.
Description: rd ← extend (rs, rt)


The contents of the high-order doubleword in GPR rs are combined with the contents of
the high-order doubleword in GPR rt in a word-wide interleaved operation. The quadword
result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]31..0   ← GPR[rt]95..64
GPR[rd]63..32  ← GPR[rs]95..64
GPR[rd]95..64  ← GPR[rt]127..96
GPR[rd]127..96 ← GPR[rs]127..96


Exceptions: 


None 
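
The six extend instructions (PEXTLB, PEXTLH, PEXTLW, PEXTUB, PEXTUH, PEXTUW) follow one pattern: lanes from one doubleword of rt fill the even lanes of rd, and the matching lanes of rs fill the odd lanes. The Python sketch below captures that pattern in a single hypothetical helper, pext, parameterised by lane width and by which doubleword is read. It is an illustration only.

```python
MASK64 = (1 << 64) - 1

def pext(rs: int, rt: int, lane_bits: int, upper: bool) -> int:
    """Interleave lanes of one doubleword of rt (even lanes) and rs (odd lanes)."""
    shift = 64 if upper else 0
    src_rs = (rs >> shift) & MASK64
    src_rt = (rt >> shift) & MASK64
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for i in range(64 // lane_bits):
        rt_lane = (src_rt >> (lane_bits * i)) & lane_mask
        rs_lane = (src_rs >> (lane_bits * i)) & lane_mask
        result |= rt_lane << (2 * lane_bits * i)
        result |= rs_lane << (2 * lane_bits * i + lane_bits)
    return result

def pextlb(rs: int, rt: int) -> int:
    return pext(rs, rt, 8, upper=False)

def pextuh(rs: int, rt: int) -> int:
    return pext(rs, rt, 16, upper=True)

print(hex(pextlb(0x8877665544332211, 0xA8A7A6A5A4A3A2A1)))
# 0x88a877a766a655a544a433a322a211a1
```

The remaining variants are obtained the same way, for example pext(rs, rt, 32, upper=False) for PEXTLW.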
Format: PHMADH rd, rs, rt
Purpose: To multiply eight pairs of 16-bit signed integers and horizontally add.
Description: (rd, HI, LO) ← rs x rt + rs x rt


The eight signed halfwords in GPR rs are multiplied by the eight signed halfwords in GPR
rt in parallel. Within each word, the product of the high-order halfwords is added to the
product of the low-order halfwords, and the four word results are placed into the
corresponding words in special registers HI and LO and in GPR rd.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


None 

Operation: 
prod0 ← GPR[rs]31..16   x GPR[rt]31..16   + GPR[rs]15..0   x GPR[rt]15..0
prod1 ← GPR[rs]63..48   x GPR[rt]63..48   + GPR[rs]47..32  x GPR[rt]47..32
prod2 ← GPR[rs]95..80   x GPR[rt]95..80   + GPR[rs]79..64  x GPR[rt]79..64
prod3 ← GPR[rs]127..112 x GPR[rt]127..112 + GPR[rs]111..96 x GPR[rt]111..96
LO31..0   ← prod0 31..0
LO63..32  ← Undefined
HI31..0   ← prod1 31..0
HI63..32  ← Undefined
LO95..64  ← prod2 31..0
LO127..96 ← Undefined
HI95..64  ← prod3 31..0
HI127..96 ← Undefined
GPR[rd]31..0   ← prod0 31..0
GPR[rd]63..32  ← prod1 31..0
GPR[rd]95..64  ← prod2 31..0
GPR[rd]127..96 ← prod3 31..0
        127..96          95..64           63..32           31..0
rd   A7xB7 + A6xB6    A5xB5 + A4xB4    A3xB3 + A2xB2    A1xB1 + A0xB0
HI   Undefined        A7xB7 + A6xB6    Undefined        A3xB3 + A2xB2
LO   Undefined        A5xB5 + A4xB4    Undefined        A1xB1 + A0xB0


Exceptions: 
None 
Programming Notes: 


In the C790, the integer multiply operation allows other CPU instructions to execute
out-of-order. An attempt to read the LO or HI registers before the results are written will cause
an interlock until the results are ready. Asynchronous execution does not affect the 
program result, but offers an opportunity for performance improvement by scheduling the 
multiply so that other instructions can execute in parallel. 


Programs that require overflow detection must check for it explicitly. 
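
The horizontal multiply-add can be sketched in Python as follows. The sketch returns the four word results and the rd value; it assumes that each sum wraps to 32 bits (the manual states only that no exception occurs), and the names s16 and phmadh are hypothetical. Words of HI and LO that the Operation section marks as Undefined are not modelled.

```python
def s16(value: int) -> int:
    """Signed value of the low 16 bits."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def phmadh(rs: int, rt: int):
    """Return (rd, [prod0, prod1, prod2, prod3]) for the PHMADH operation."""
    prods = []
    for word in range(4):
        low = s16(rs >> (32 * word)) * s16(rt >> (32 * word))
        high = s16(rs >> (32 * word + 16)) * s16(rt >> (32 * word + 16))
        prods.append((high + low) & 0xFFFFFFFF)   # assumed to wrap to 32 bits
    rd = 0
    for word, prod in enumerate(prods):
        rd |= prod << (32 * word)
    # prod0 and prod2 also go to LO31..0 and LO95..64; prod1 and prod3 to HI31..0 and HI95..64.
    return rd, prods

rd, prods = phmadh(0x80008000, 0x80008000)
print(hex(prods[0]))  # 0x80000000: the one input pattern that overflows a signed word
```

A program that needs overflow detection could compare the wrapped word against the exact Python sum, which is the kind of explicit check the note above refers to.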
Format: PHMSBH rd, rs, rt
Purpose: To multiply eight pairs of 16-bit signed integers and horizontally subtract.
Description: (rd, HI, LO) ← rs x rt - rs x rt


The eight signed halfwords in GPR rs are multiplied by the eight signed halfwords in GPR
rt in parallel. Within each word, the product of the low-order halfwords is subtracted from
the product of the high-order halfwords, and the four word results are placed into the
corresponding words in special registers HI and LO and in GPR rd.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


None 

Operation: 
prod0 ← GPR[rs]31..16   x GPR[rt]31..16   - GPR[rs]15..0   x GPR[rt]15..0
prod1 ← GPR[rs]63..48   x GPR[rt]63..48   - GPR[rs]47..32  x GPR[rt]47..32
prod2 ← GPR[rs]95..80   x GPR[rt]95..80   - GPR[rs]79..64  x GPR[rt]79..64
prod3 ← GPR[rs]127..112 x GPR[rt]127..112 - GPR[rs]111..96 x GPR[rt]111..96
LO31..0   ← prod0 31..0
LO63..32  ← Undefined
HI31..0   ← prod1 31..0
HI63..32  ← Undefined
LO95..64  ← prod2 31..0
LO127..96 ← Undefined
HI95..64  ← prod3 31..0
HI127..96 ← Undefined
GPR[rd]31..0   ← prod0 31..0
GPR[rd]63..32  ← prod1 31..0
GPR[rd]95..64  ← prod2 31..0
GPR[rd]127..96 ← prod3 31..0
        127..96          95..64           63..32           31..0
rd   A7xB7 - A6xB6    A5xB5 - A4xB4    A3xB3 - A2xB2    A1xB1 - A0xB0
HI   Undefined        A7xB7 - A6xB6    Undefined        A3xB3 - A2xB2
LO   Undefined        A5xB5 - A4xB4    Undefined        A1xB1 - A0xB0


Exceptions: 
None 
Programming Notes: 


In the C790, the integer multiply operation allows other CPU instructions to execute
out-of-order. An attempt to read the LO or HI registers before the results are written will wait
(interlock) until the results are ready. Asynchronous execution does not affect the program 
result, but offers an opportunity for performance improvement by scheduling the multiply 
so that other instructions can execute in parallel. 


Programs that require overflow detection must check for it explicitly. 
PINTEH Parallel Interleave Even Halfword PINTEH


[Instruction encoding diagram: field boundaries at bits 26/25, 21/20, 16/15 and 11/10; surviving field values 011100, 01010 and 101001]
C790
Format: PINTEH rd, rs, rt 
Purpose: To combine halfwords in a halfword wide interleaved operation. 
Description: rd ← interleave (rs, rt)


The low-order halfwords of the four words in GPR rs are combined with the low-order
halfwords of the four words in GPR rt in a halfword-wide interleaved operation. The
quadword result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]15..0    ← GPR[rt]15..0
GPR[rd]31..16   ← GPR[rs]15..0
GPR[rd]47..32   ← GPR[rt]47..32
GPR[rd]63..48   ← GPR[rs]47..32
GPR[rd]79..64   ← GPR[rt]79..64
GPR[rd]95..80   ← GPR[rs]79..64
GPR[rd]111..96  ← GPR[rt]111..96
GPR[rd]127..112 ← GPR[rs]111..96


Exceptions: 


None 
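
In other words, each result word is built from the even (low-order) halfword of the same word in rs and rt, with the odd halfwords of both sources discarded. A short Python sketch, using the hypothetical function name pinteh, shows this:

```python
def pinteh(rs: int, rt: int) -> int:
    """Model PINTEH: per word, rd = (low halfword of rs) || (low halfword of rt)."""
    result = 0
    for word in range(4):
        rt_half = (rt >> (32 * word)) & 0xFFFF
        rs_half = (rs >> (32 * word)) & 0xFFFF
        result |= ((rs_half << 16) | rt_half) << (32 * word)
    return result

rs = 0xDEAD_0004_DEAD_0003_DEAD_0002_DEAD_0001
rt = 0xBEEF_000D_BEEF_000C_BEEF_000B_BEEF_000A
print(hex(pinteh(rs, rt)))  # 0x4000d0003000c0002000b0001000a
```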
PINTH Parallel Interleave Halfword PINTH


[Instruction encoding diagram: field boundaries at bits 26/25, 21/20, 16/15 and 11/10]
C790
Format: PINTH rd, rs, rt 
Purpose: To combine doublewords in a halfword wide interleaved operation. 
Description: rd ← interleave (rs, rt)


The contents of the high-order doubleword in GPR rs are combined with the contents of
the low-order doubleword in GPR rt in a halfword-wide interleaved operation. The
quadword result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]15..0    ← GPR[rt]15..0
GPR[rd]31..16   ← GPR[rs]79..64
GPR[rd]47..32   ← GPR[rt]31..16
GPR[rd]63..48   ← GPR[rs]95..80
GPR[rd]79..64   ← GPR[rt]47..32
GPR[rd]95..80   ← GPR[rs]111..96
GPR[rd]111..96  ← GPR[rt]63..48
GPR[rd]127..112 ← GPR[rs]127..112


Exceptions: 


None 
PLZCW Parallel Leading Zero or One Count Word PLZCW


[Instruction encoding diagram: field boundaries at bits 26/25, 21/20, 16/15 and 11/10]
C790
Format: PLZCW rd, rs 
Purpose: To count leading zero (s) or one (s) (2 parallel operations). 
Description: rd ← LZC (rs) - 1


The leading zeros or ones of each of the two words in GPR rs are counted. Each leading
count minus one is placed into the corresponding word in GPR rd.
Operation: 


GPR[rd]31..0  ← Leading zero or one count (GPR[rs]31..0) - 1
GPR[rd]63..32 ← Leading zero or one count (GPR[rs]63..32) - 1


Example:


        63..32                31..0
rs   0x000FFFFF            (a word with eight leading ones)
     Leading zero count    Leading one count
rd   0x0000000B            0x00000007
Exceptions: 
None 
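
The count can be reproduced with a simple loop: take the most significant bit of the word, count how many consecutive bits from the top match it, and subtract one. The Python sketch below does this for both words; the function names are hypothetical, and only bits 63..0 of rd are modelled because the Operation section defines nothing else.

```python
def leading_count_minus_one(word: int) -> int:
    """Count the run of bits equal to bit 31, starting at bit 31, minus one."""
    word &= 0xFFFFFFFF
    top = word >> 31
    count = 0
    for bit in range(31, -1, -1):
        if ((word >> bit) & 1) != top:
            break
        count += 1
    return count - 1

def plzcw(rs: int) -> int:
    """Return bits 63..0 of rd for the PLZCW operation."""
    low = leading_count_minus_one(rs)
    high = leading_count_minus_one(rs >> 32)
    return (high << 32) | low

# 0x000FFFFF has 12 leading zeros; 0xFF000000 has 8 leading ones.
print(hex(plzcw((0x000FFFFF << 32) | 0xFF000000)))  # 0xb00000007
```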
PMADDH Parallel Multiply-Add Halfword PMADDH


[Instruction encoding diagram: field boundaries at bits 26/25, 21/20, 16/15 and 11/10]
C790
Format: PMADDH rd, rs, rt 
Purpose: To multiply 8 pairs of 16-bit signed integers and accumulate, in parallel. 
Description: (rd, HI, LO) ← (HI, LO) + rs x rt


The eight signed halfwords in GPR rs are multiplied by the eight signed halfwords in GPR
rt in parallel. The eight word multiply results are added to the corresponding words in
special registers HI and LO, and the word results are placed into the corresponding words
in special registers HI, LO and GPR rd.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


None 

Operation: 
prod0 ← LO31..0   + GPR[rs]15..0    x GPR[rt]15..0
prod1 ← LO63..32  + GPR[rs]31..16   x GPR[rt]31..16
prod2 ← HI31..0   + GPR[rs]47..32   x GPR[rt]47..32
prod3 ← HI63..32  + GPR[rs]63..48   x GPR[rt]63..48
prod4 ← LO95..64  + GPR[rs]79..64   x GPR[rt]79..64
prod5 ← LO127..96 + GPR[rs]95..80   x GPR[rt]95..80
prod6 ← HI95..64  + GPR[rs]111..96  x GPR[rt]111..96
prod7 ← HI127..96 + GPR[rs]127..112 x GPR[rt]127..112
LO31..0   ← prod0 31..0
LO63..32  ← prod1 31..0
HI31..0   ← prod2 31..0
HI63..32  ← prod3 31..0
LO95..64  ← prod4 31..0
LO127..96 ← prod5 31..0
HI95..64  ← prod6 31..0
HI127..96 ← prod7 31..0
GPR[rd]31..0   ← prod0 31..0
GPR[rd]63..32  ← prod2 31..0
GPR[rd]95..64  ← prod4 31..0
GPR[rd]127..96 ← prod6 31..0
        127..96           95..64            63..32            31..0
rd   A6 x B6 + C6     A4 x B4 + C4     A2 x B2 + C2     A0 x B0 + C0
HI   A7 x B7 + C7     A6 x B6 + C6     A3 x B3 + C3     A2 x B2 + C2
LO   A5 x B5 + C5     A4 x B4 + C4     A1 x B1 + C1     A0 x B0 + C0
Exceptions: 
None 


Programming Notes: 


In the C790, the integer multiply operation allows other CPU instructions to execute
out-of-order. An attempt to read the LO or HI registers before the results are written will cause
an interlock until the results are ready. Asynchronous execution does not affect the 
program result, but offers an opportunity for performance improvement by scheduling the 
multiply so that other instructions can execute in parallel. 


Programs that require overflow detection must check for it explicitly. 
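
The mapping between the eight products and the words of HI and LO is the least obvious part of PMADDH, so the Python sketch below spells it out as a table. It is an illustrative model only: ACC_SLOT, s16 and pmaddh are hypothetical names, and each accumulated word is assumed to wrap to 32 bits.

```python
def s16(value: int) -> int:
    """Signed value of the low 16 bits."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

# Accumulator word used by product i, as (register, word index), per the Operation list.
ACC_SLOT = [("lo", 0), ("lo", 1), ("hi", 0), ("hi", 1),
            ("lo", 2), ("lo", 3), ("hi", 2), ("hi", 3)]

def pmaddh(rs: int, rt: int, hi: int, lo: int):
    """Return (rd, new_hi, new_lo) for the PMADDH operation."""
    old = {"hi": hi, "lo": lo}
    new = {"hi": 0, "lo": 0}
    prods = []
    for i, (name, word) in enumerate(ACC_SLOT):
        acc = (old[name] >> (32 * word)) & 0xFFFFFFFF
        prod = (acc + s16(rs >> (16 * i)) * s16(rt >> (16 * i))) & 0xFFFFFFFF
        new[name] |= prod << (32 * word)
        prods.append(prod)
    # rd receives the even-numbered products only.
    rd = prods[0] | (prods[2] << 32) | (prods[4] << 64) | (prods[6] << 96)
    return rd, new["hi"], new["lo"]

rs = 0x0008_0007_0006_0005_0004_0003_0002_0001
rt = 0x0002_0002_0002_0002_0002_0002_0002_0002
rd, hi, lo = pmaddh(rs, rt, 0, 0)
print(hex(rd), hex(hi), hex(lo))
```

Calling pmaddh again with the returned hi and lo accumulates a second set of products on top of the first.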
PMADDUW Parallel Multiply-Add Unsigned Word PMADDUW


[Instruction encoding diagram: field boundaries at bits 26/25, 21/20, 16/15 and 11/10]
C790
Format: PMADDUW rd, rs, rt
Purpose: To multiply two pairs of 32-bit unsigned integers and accumulate in parallel.
Description: (rd, HI, LO) ← (HI, LO) + rs x rt


The low-order unsigned words of the two doublewords in GPR rs are multiplied by the
low-order unsigned words of the two doublewords in GPR rt in parallel. The two 64-bit multiply
results are added to the contents of special registers HI and LO. The low-order words of the
two doubleword results are placed into special register LO, and the high-order words of the
two doubleword results are placed into special register HI. The two doubleword results are
placed into GPR rd.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


If either GPR rt or GPR rs does not contain zero-extended 32-bit values (bits 127..96 and
63..32 equal zero), the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod0 ← (HI31..0 || LO31..0)   + (0 || GPR[rs]31..0)  x (0 || GPR[rt]31..0)
prod1 ← (HI95..64 || LO95..64) + (0 || GPR[rs]95..64) x (0 || GPR[rt]95..64)
LO63..0   ← (prod0 31)^32 || prod0 31..0
HI63..0   ← (prod0 63)^32 || prod0 63..32
LO127..64 ← (prod1 31)^32 || prod1 31..0
HI127..64 ← (prod1 63)^32 || prod1 63..32
GPR[rd]63..0   ← prod0 63..0
GPR[rd]127..64 ← prod1 63..0
rd   127..64: (0 || A2) x (0 || B2) + (C6 || C4)
     63..0:   (0 || A0) x (0 || B0) + (C2 || C0)
HI   127..64: ((0 || A2) x (0 || B2) + (C6 || C4))63..32
     63..0:   ((0 || A0) x (0 || B0) + (C2 || C0))63..32
LO   127..64: ((0 || A2) x (0 || B2) + (C6 || C4))31..0
     63..0:   ((0 || A0) x (0 || B0) + (C2 || C0))31..0

Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the PMADDH instruction. 


PMADDW Parallel Multiply-Add Word PMADDW


[Instruction encoding diagram: field boundaries at bits 26/25, 21/20, 16/15 and 11/10]
C790
Format: PMADDW rd, rs, rt
Purpose: To multiply two pairs of 32-bit signed integers and accumulate in parallel.
Description: (rd, HI, LO) ← (HI, LO) + rs x rt


The low-order signed words of the two doublewords in GPR rs are multiplied by the
low-order signed words of the two doublewords in GPR rt in parallel. The two 64-bit multiply
results are added to the contents of special registers HI and LO. The low-order words of the
two doubleword results are placed into special register LO, and the high-order words of the
two doubleword results are placed into special register HI. The two doubleword results are
placed into GPR rd.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


If either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 127..95 equal and
63..31 equal), the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod0 ← (HI31..0 || LO31..0)   + GPR[rs]31..0  x GPR[rt]31..0
prod1 ← (HI95..64 || LO95..64) + GPR[rs]95..64 x GPR[rt]95..64
LO63..0   ← (prod0 31)^32 || prod0 31..0
HI63..0   ← (prod0 63)^32 || prod0 63..32
LO127..64 ← (prod1 31)^32 || prod1 31..0
HI127..64 ← (prod1 63)^32 || prod1 63..32
GPR[rd]63..0   ← prod0 63..0
GPR[rd]127..64 ← prod1 63..0
rd   127..64: A2 x B2 + (C6 || C4)
     63..0:   A0 x B0 + (C2 || C0)
HI   127..64: (A2 x B2 + (C6 || C4))63..32
     63..0:   (A0 x B0 + (C2 || C0))63..32
LO   127..64: (A2 x B2 + (C6 || C4))31..0
     63..0:   (A0 x B0 + (C2 || C0))31..0

Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the PMADDH instruction. 
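
For PMADDW, the accumulator of each lane is a 64-bit value split across one word of HI and one word of LO. The Python sketch below models that split and the sign extension of the results. It is a sketch under assumptions: the 64-bit sum is assumed to wrap, the ValueError stands in for the undefined result of a violated restriction, and the names sext32 and pmaddw are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

def sext32(value: int) -> int:
    """Signed value of the low 32 bits."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def pmaddw(rs: int, rt: int, hi: int, lo: int):
    """Return (rd, new_hi, new_lo) for the PMADDW operation."""
    rd = new_hi = new_lo = 0
    for shift in (0, 64):
        src_rs = (rs >> shift) & MASK64
        src_rt = (rt >> shift) & MASK64
        # NotWordValue: each source doubleword must be a sign-extended word.
        if src_rs != sext32(src_rs) & MASK64 or src_rt != sext32(src_rt) & MASK64:
            raise ValueError("operand is not a sign-extended 32-bit value")
        acc = (((hi >> shift) & MASK32) << 32) | ((lo >> shift) & MASK32)
        total = (acc + sext32(src_rs) * sext32(src_rt)) & MASK64
        new_lo |= (sext32(total) & MASK64) << shift        # low word, sign-extended
        new_hi |= (sext32(total >> 32) & MASK64) << shift  # high word, sign-extended
        rd |= total << shift
    return rd, new_hi, new_lo

rs = (4 << 64) | 0xFFFFFFFFFFFFFFFD   # lanes: 4 and -3
rt = (6 << 64) | 5                    # lanes: 6 and 5
print([hex(v) for v in pmaddw(rs, rt, 0, 0)])
```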


PMAXH Parallel Maximum Halfword PMAXH


[Instruction encoding diagram: field boundaries at bits 26/25, 21/20, 16/15 and 11/10; leftmost field 011100, rightmost field 001000]
C790
Format: PMAXH rd, rs, rt 
Purpose: To select maximum 16-bit signed integers (8 parallel operations). 
Description: rd ← max (rs, rt)


The eight signed halfword values in GPR rt are subtracted from the corresponding eight
signed halfword values in GPR rs in parallel. If the result of the subtraction is greater
than zero, the signed halfword value in GPR rs is placed into the corresponding halfword
of GPR rd; otherwise the signed halfword value in GPR rt is placed into the corresponding
halfword of GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


if ((GPR[rs]15..0 - GPR[rt]15..0) > 0) then
    GPR[rd]15..0 ← GPR[rs]15..0
else
    GPR[rd]15..0 ← GPR[rt]15..0
endif

if ((GPR[rs]31..16 - GPR[rt]31..16) > 0) then
    GPR[rd]31..16 ← GPR[rs]31..16
else
    GPR[rd]31..16 ← GPR[rt]31..16
endif

if ((GPR[rs]47..32 - GPR[rt]47..32) > 0) then
    GPR[rd]47..32 ← GPR[rs]47..32
else
    GPR[rd]47..32 ← GPR[rt]47..32
endif

if ((GPR[rs]63..48 - GPR[rt]63..48) > 0) then
    GPR[rd]63..48 ← GPR[rs]63..48
else
    GPR[rd]63..48 ← GPR[rt]63..48
endif

if ((GPR[rs]79..64 - GPR[rt]79..64) > 0) then
    GPR[rd]79..64 ← GPR[rs]79..64
else
    GPR[rd]79..64 ← GPR[rt]79..64
endif

if ((GPR[rs]95..80 - GPR[rt]95..80) > 0) then
    GPR[rd]95..80 ← GPR[rs]95..80
else
    GPR[rd]95..80 ← GPR[rt]95..80
endif

if ((GPR[rs]111..96 - GPR[rt]111..96) > 0) then
    GPR[rd]111..96 ← GPR[rs]111..96
else
    GPR[rd]111..96 ← GPR[rt]111..96
endif

if ((GPR[rs]127..112 - GPR[rt]127..112) > 0) then
    GPR[rd]127..112 ← GPR[rs]127..112
else
    GPR[rd]127..112 ← GPR[rt]127..112
endif


rd | max (A7, B7) | max (A6, B6) | max (A5, B5) | max (A4, B4) | max (A3, B3) | max (A2, B2) | max (A1, B1) | max (A0, B0)


Exceptions: 


None 
PMAXW Parallel Maximum Word PMAXW


[Instruction encoding diagram: field boundaries at bits 26/25, 21/20, 16/15 and 11/10; leftmost field 011100, rightmost field 001000]
C790
Format: PMAXW rd, rs, rt
Purpose: To select maximum 32-bit signed integers (4 parallel operations).
Description: rd ← max (rs, rt)


The four signed word values in GPR rt are subtracted from the corresponding four signed
word values in GPR rs in parallel. If the result of the subtraction is greater than zero, the
signed word value in GPR rs is placed into the corresponding word of GPR rd; otherwise
the signed word value in GPR rt is placed into the corresponding word of GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


if ((GPR[rs]31..0 - GPR[rt]31..0) > 0) then
    GPR[rd]31..0 ← GPR[rs]31..0
else
    GPR[rd]31..0 ← GPR[rt]31..0
endif

if ((GPR[rs]63..32 - GPR[rt]63..32) > 0) then
    GPR[rd]63..32 ← GPR[rs]63..32
else
    GPR[rd]63..32 ← GPR[rt]63..32
endif

if ((GPR[rs]95..64 - GPR[rt]95..64) > 0) then
    GPR[rd]95..64 ← GPR[rs]95..64
else
    GPR[rd]95..64 ← GPR[rt]95..64
endif

if ((GPR[rs]127..96 - GPR[rt]127..96) > 0) then
    GPR[rd]127..96 ← GPR[rs]127..96
else
    GPR[rd]127..96 ← GPR[rt]127..96
endif
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127 96 95 64 63 32 31 0
rd | max (A3, B3) | max (A2, B2) | max (A1, B1) | max (A0, B0)
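
The per-word selection in the Operation above can be modelled in software. The following is a minimal Python sketch, not Toshiba code: the names `to_signed` and `pmaxw` are hypothetical, registers are represented as 128-bit Python integers, and the test "rs - rt > 0" is assumed to behave as an overflow-free signed comparison.

```python
def to_signed(value, bits):
    """Reinterpret an unsigned lane value as a two's-complement integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pmaxw(rs, rt):
    """Hypothetical model of PMAXW on two 128-bit integers."""
    rd = 0
    for lane in range(4):
        shift = 32 * lane
        a = (rs >> shift) & 0xFFFFFFFF
        b = (rt >> shift) & 0xFFFFFFFF
        # Assumption: "rs - rt > 0" is evaluated without overflow,
        # i.e. as a plain signed comparison of the two words.
        winner = a if to_signed(a, 32) > to_signed(b, 32) else b
        rd |= winner << shift
    return rd
```

Each word is handled independently, so a negative value in one word never affects its neighbours.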
Exceptions: 

None
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
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PMFHI Parallel Move From HI Register PMFHI


31 26 25 16 15 11 10 6 5 0
MMI 0 rd PMFHI MMI2
011100 0000000000
C790
Format: PMFHI rd
Purpose: To copy the special purpose register HI to a GPR.
Description: rd ← HI


The contents of special register HI are loaded into GPR rd.


This instruction operates on 128-bit registers. 


Restrictions: 


None 


Operation: 
GPR[rd]127..0 ← HI127..0


Exceptions: 


None
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PMFHL.fmt Parallel Move From HI / LO Register PMFHL.fmt


31 26 25 16 15 11 10 6 5 0
MMI 0 rd fmt PMFHL
011100 0000000000
C790
Format: PMFHL.LW rd (fmt = 0)
PMFHL.UW rd (fmt = 1)
PMFHL.SLW rd (fmt = 2)
PMFHL.LH rd (fmt = 3)
PMFHL.SH rd (fmt = 4)


Purpose: To copy the special purpose registers HI / LO to a GPR.
Description: rd ← HI/LO


The contents of special registers HI / LO are loaded into GPR rd.


This instruction operates on 128-bit registers. 


Restrictions: 


None 


Operation: 
if (fmt = 0) then
    GPR[rd]31..0 ← LO31..0
    GPR[rd]63..32 ← HI31..0
    GPR[rd]95..64 ← LO95..64
    GPR[rd]127..96 ← HI95..64

else if (fmt = 1) then
    GPR[rd]31..0 ← LO63..32
    GPR[rd]63..32 ← HI63..32
    GPR[rd]95..64 ← LO127..96
    GPR[rd]127..96 ← HI127..96

else if (fmt = 2) then
    if (0x7FFFFFFFFFFFFFFF >= (HI31..0 || LO31..0) > 0x000000007FFFFFFF) then
        GPR[rd]63..0 ← 0x000000007FFFFFFF
    else if (0x8000000000000000 <= (HI31..0 || LO31..0) < 0xFFFFFFFF80000000) then
        GPR[rd]63..0 ← 0xFFFFFFFF80000000
    else
        GPR[rd]63..0 ← HI31..0 || LO31..0
    endif
    if ((HI95..64 || LO95..64) > 0x000000007FFFFFFF) then
        GPR[rd]127..64 ← 0x000000007FFFFFFF
    else if ((HI95..64 || LO95..64) < 0xFFFFFFFF80000000) then
        GPR[rd]127..64 ← 0xFFFFFFFF80000000
    else
        GPR[rd]127..64 ← (LO95)^32 || LO95..64
    endif

else if (fmt = 3) then
    GPR[rd]15..0 ← LO15..0
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    GPR[rd]31..16 ← LO47..32
    GPR[rd]47..32 ← HI15..0
    GPR[rd]63..48 ← HI47..32
    GPR[rd]79..64 ← LO79..64
    GPR[rd]95..80 ← LO111..96
    GPR[rd]111..96 ← HI79..64
    GPR[rd]127..112 ← HI111..96

else if (fmt = 4) then
    if (0x7FFFFFFF >= LO31..0 > 0x00007FFF) then
        GPR[rd]15..0 ← 0x7FFF
    else if (0x80000000 <= LO31..0 < 0xFFFF8000) then
        GPR[rd]15..0 ← 0x8000
    else
        GPR[rd]15..0 ← LO15..0
    endif

    if (LO63..32 > 0x00007FFF) then
        GPR[rd]31..16 ← 0x7FFF
    else if (LO63..32 < 0xFFFF8000) then
        GPR[rd]31..16 ← 0x8000
    else
        GPR[rd]31..16 ← LO47..32
    endif

    if (HI31..0 > 0x00007FFF) then
        GPR[rd]47..32 ← 0x7FFF
    else if (HI31..0 < 0xFFFF8000) then
        GPR[rd]47..32 ← 0x8000
    else
        GPR[rd]47..32 ← HI15..0
    endif

    if (HI63..32 > 0x00007FFF) then
        GPR[rd]63..48 ← 0x7FFF
    else if (HI63..32 < 0xFFFF8000) then
        GPR[rd]63..48 ← 0x8000
    else
        GPR[rd]63..48 ← HI47..32
    endif

    if (LO95..64 > 0x00007FFF) then
        GPR[rd]79..64 ← 0x7FFF
    else if (LO95..64 < 0xFFFF8000) then
        GPR[rd]79..64 ← 0x8000
    else
        GPR[rd]79..64 ← LO79..64
    endif

    if (LO127..96 > 0x00007FFF) then
        GPR[rd]95..80 ← 0x7FFF
    else if (LO127..96 < 0xFFFF8000) then
        GPR[rd]95..80 ← 0x8000
    else
        GPR[rd]95..80 ← LO111..96
    endif

    if (HI95..64 > 0x00007FFF) then
        GPR[rd]111..96 ← 0x7FFF
    else if (HI95..64 < 0xFFFF8000) then
        GPR[rd]111..96 ← 0x8000
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    else
        GPR[rd]111..96 ← HI79..64
    endif

    if (HI127..96 > 0x00007FFF) then
        GPR[rd]127..112 ← 0x7FFF
    else if (HI127..96 < 0xFFFF8000) then
        GPR[rd]127..112 ← 0x8000
    else
        GPR[rd]127..112 ← HI111..96
    endif

endif


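(fmt = 0)
[Figure: words 3..0 of rd receive HI95..64, LO95..64, HI31..0 and LO31..0]

(fmt = 1)
[Figure: words 3..0 of rd receive HI127..96, LO127..96, HI63..32 and LO63..32]

(fmt = 2)
[Figure: each 64-bit value HI || LO is saturated to a signed word and placed, sign-extended, into the corresponding doubleword of rd]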
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(fmt = 3) 


[Figure: halfwords 7..0 of rd receive HI111..96, HI79..64, LO111..96, LO79..64, HI47..32, HI15..0, LO47..32 and LO15..0]
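
The fmt variants differ mainly in which fields are gathered and whether they are saturated. The sketch below models three of them (LW, SLW and SH) in Python under stated assumptions: the function names are hypothetical, HI, LO and rd are 128-bit Python integers, and the saturation bounds are taken from the Operation section above.

```python
MASK32 = 0xFFFFFFFF


def word(reg, i):
    """Return 32-bit word i (0 = bits 31..0) of a 128-bit register value."""
    return (reg >> (32 * i)) & MASK32


def signed(value, bits):
    """Reinterpret an unsigned field as a two's-complement signed integer."""
    return value - (1 << bits) if value >> (bits - 1) else value


def saturate(value, bits):
    """Clamp a signed integer to `bits` bits and return its unsigned encoding."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value)) & ((1 << bits) - 1)


def pmfhl_lw(hi, lo):
    """fmt = 0: interleave the even words of LO and HI."""
    parts = [word(lo, 0), word(hi, 0), word(lo, 2), word(hi, 2)]
    return sum(p << (32 * i) for i, p in enumerate(parts))


def pmfhl_slw(hi, lo):
    """fmt = 2: saturate (HI word || LO word) to a signed word, sign-extended."""
    rd = 0
    for half, w in enumerate((0, 2)):
        value = signed((word(hi, w) << 32) | word(lo, w), 64)
        sat = signed(saturate(value, 32), 32) & 0xFFFFFFFFFFFFFFFF
        rd |= sat << (64 * half)
    return rd


def pmfhl_sh(hi, lo):
    """fmt = 4: saturate every word of LO and HI to a signed halfword."""
    order = [(lo, 0), (lo, 1), (hi, 0), (hi, 1), (lo, 2), (lo, 3), (hi, 2), (hi, 3)]
    rd = 0
    for i, (reg, w) in enumerate(order):
        rd |= saturate(signed(word(reg, w), 32), 16) << (16 * i)
    return rd
```

For example, `pmfhl_sh(0, 0x12345)` clamps the low word of LO to 0x7FFF because 0x12345 does not fit in a signed halfword.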


Exceptions: 


None
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PMFLO Parallel Move From LO Register PMFLO


31 26 25 16 15 11 10 6 5 0
MMI 0 rd PMFLO MMI2
011100 0000000000
C790
Format: PMFLO rd
Purpose: To copy the special purpose register LO to a GPR.
Description: rd ← LO


The contents of special register LO are loaded into GPR rd. 
This instruction operates on 128-bit registers. 
Restrictions: 


None 


Operation: 
GPR[rd]127..0 ← LO127..0


Exceptions: 


None
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PMINH Parallel Minimum Halfword PMINH


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PMINH MMI1
011100 101000
C790
Format: PMINH rd, rs, rt
Purpose: To select the minimum of two 16-bit signed integers (8 parallel operations).
Description: rd ← min (rs, rt)


The eight signed halfword values in GPR rt are subtracted from the corresponding eight
signed halfword values in GPR rs in parallel. If the result of a subtraction is larger
than zero, the corresponding signed halfword in GPR rt is placed into the corresponding
halfword in GPR rd; otherwise the corresponding signed halfword in GPR rs is placed into
the corresponding halfword of GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


if ((GPR[rs]15..0 - GPR[rt]15..0) > 0) then
    GPR[rd]15..0 ← GPR[rt]15..0
else
    GPR[rd]15..0 ← GPR[rs]15..0
endif

if ((GPR[rs]31..16 - GPR[rt]31..16) > 0) then
    GPR[rd]31..16 ← GPR[rt]31..16
else
    GPR[rd]31..16 ← GPR[rs]31..16
endif

if ((GPR[rs]47..32 - GPR[rt]47..32) > 0) then
    GPR[rd]47..32 ← GPR[rt]47..32
else
    GPR[rd]47..32 ← GPR[rs]47..32
endif

if ((GPR[rs]63..48 - GPR[rt]63..48) > 0) then
    GPR[rd]63..48 ← GPR[rt]63..48
else
    GPR[rd]63..48 ← GPR[rs]63..48
endif

if ((GPR[rs]79..64 - GPR[rt]79..64) > 0) then
    GPR[rd]79..64 ← GPR[rt]79..64
else
    GPR[rd]79..64 ← GPR[rs]79..64
endif

if ((GPR[rs]95..80 - GPR[rt]95..80) > 0) then
    GPR[rd]95..80 ← GPR[rt]95..80
else
    GPR[rd]95..80 ← GPR[rs]95..80
endif
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if ((GPR[rs]111..96 - GPR[rt]111..96) > 0) then
    GPR[rd]111..96 ← GPR[rt]111..96
else
    GPR[rd]111..96 ← GPR[rs]111..96
endif

if ((GPR[rs]127..112 - GPR[rt]127..112) > 0) then
    GPR[rd]127..112 ← GPR[rt]127..112
else
    GPR[rd]127..112 ← GPR[rs]127..112
endif


127 112 111 96 95 80 79 64 63 48 47 32 31 16 15 0
rd | min (A7, B7) | min (A6, B6) | min (A5, B5) | min (A4, B4) | min (A3, B3) | min (A2, B2) | min (A1, B1) | min (A0, B0)
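
The eight halfword selections can be modelled as a loop over 16-bit lanes. This is a minimal Python sketch with a hypothetical function name `pminh`; it assumes that "rs - rt > 0" is an overflow-free signed comparison and represents registers as 128-bit integers.

```python
def pminh(rs, rt):
    """Hypothetical model of PMINH: per-halfword signed minimum."""
    rd = 0
    for lane in range(8):
        shift = 16 * lane
        a = (rs >> shift) & 0xFFFF
        b = (rt >> shift) & 0xFFFF
        sa = a - 0x10000 if a & 0x8000 else a
        sb = b - 0x10000 if b & 0x8000 else b
        # rt is chosen when rs - rt > 0, otherwise rs
        # (assumed to be a comparison without overflow).
        rd |= (b if sa > sb else a) << shift
    return rd
```

Because the comparison is signed, 0xFFFF (that is, -1) is smaller than 0x0001.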


Exceptions: 


None
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PMINW Parallel Minimum Word PMINW


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PMINW MMI1
011100 101000
C790
Format: PMINW rd, rs, rt
Purpose: To select the minimum of two 32-bit signed integers (4 parallel operations).
Description: rd ← min (rs, rt)


The four signed word values in GPR rt are subtracted from the corresponding four signed
word values in GPR rs, in parallel. If the result of a subtraction is larger than zero, the
corresponding signed word value in GPR rt is placed into the corresponding word of GPR
rd; otherwise the corresponding signed word value in GPR rs is placed into the
corresponding word of GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


if ((GPR[rs]31..0 - GPR[rt]31..0) > 0) then
    GPR[rd]31..0 ← GPR[rt]31..0
else
    GPR[rd]31..0 ← GPR[rs]31..0
endif

if ((GPR[rs]63..32 - GPR[rt]63..32) > 0) then
    GPR[rd]63..32 ← GPR[rt]63..32
else
    GPR[rd]63..32 ← GPR[rs]63..32
endif

if ((GPR[rs]95..64 - GPR[rt]95..64) > 0) then
    GPR[rd]95..64 ← GPR[rt]95..64
else
    GPR[rd]95..64 ← GPR[rs]95..64
endif

if ((GPR[rs]127..96 - GPR[rt]127..96) > 0) then
    GPR[rd]127..96 ← GPR[rt]127..96
else
    GPR[rd]127..96 ← GPR[rs]127..96
endif
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
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127 96 95 64 63 32 31 0
rd | min (A3, B3) | min (A2, B2) | min (A1, B1) | min (A0, B0)


Exceptions: 


None
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PMSUBH Parallel Multiply-Subtract Halfword PMSUBH


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PMSUBH MMI2
C790
Format: PMSUBH rd, rs, rt
Purpose: To multiply 8 pairs of 16-bit signed integers and subtract in parallel.
Description: (rd, HI, LO) ← (HI, LO) - rs × rt


The eight signed halfwords in GPR rs are multiplied by the eight signed halfwords in GPR
rt in parallel. The eight word multiply results are subtracted from the corresponding
words in special registers HI and LO, and the word results are placed into the
corresponding words in special registers HI, LO and GPR rd.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


None 

Operation: 
prod0 ← LO31..0 - GPR[rs]15..0 × GPR[rt]15..0
prod1 ← LO63..32 - GPR[rs]31..16 × GPR[rt]31..16
prod2 ← HI31..0 - GPR[rs]47..32 × GPR[rt]47..32
prod3 ← HI63..32 - GPR[rs]63..48 × GPR[rt]63..48
prod4 ← LO95..64 - GPR[rs]79..64 × GPR[rt]79..64
prod5 ← LO127..96 - GPR[rs]95..80 × GPR[rt]95..80
prod6 ← HI95..64 - GPR[rs]111..96 × GPR[rt]111..96
prod7 ← HI127..96 - GPR[rs]127..112 × GPR[rt]127..112
LO31..0 ← prod0 31..0
LO63..32 ← prod1 31..0
HI31..0 ← prod2 31..0
HI63..32 ← prod3 31..0
LO95..64 ← prod4 31..0
LO127..96 ← prod5 31..0
HI95..64 ← prod6 31..0
HI127..96 ← prod7 31..0
GPR[rd]31..0 ← prod0 31..0
GPR[rd]63..32 ← prod2 31..0
GPR[rd]95..64 ← prod4 31..0
GPR[rd]127..96 ← prod6 31..0
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127 96 95 64 63 32 31 0
rd | C6 - A6 × B6 | C4 - A4 × B4 | C2 - A2 × B2 | C0 - A0 × B0
HI | C7 - A7 × B7 | C6 - A6 × B6 | C3 - A3 × B3 | C2 - A2 × B2
LO | C5 - A5 × B5 | C4 - A4 × B4 | C1 - A1 × B1 | C0 - A0 × B0
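
The mapping between the eight products and the words of HI, LO and rd is easy to get wrong, so a small model may help. The Python sketch below is illustrative only: `split`, `join`, `signed16` and `pmsubh` are hypothetical names, registers are 128-bit integers, and each result is assumed to wrap to 32 bits.

```python
def split(reg, bits, count):
    """Split a register value into `count` lanes of `bits` bits, lane 0 first."""
    return [(reg >> (bits * i)) & ((1 << bits) - 1) for i in range(count)]


def join(values, bits):
    """Inverse of split: lane 0 ends up in the least significant bits."""
    return sum(v << (bits * i) for i, v in enumerate(values))


def signed16(h):
    return h - 0x10000 if h & 0x8000 else h


def pmsubh(rs, rt, hi, lo):
    """Hypothetical model of PMSUBH; returns (rd, new_hi, new_lo)."""
    a = [signed16(h) for h in split(rs, 16, 8)]
    b = [signed16(h) for h in split(rt, 16, 8)]
    hw, lw = split(hi, 32, 4), split(lo, 32, 4)
    # Accumulator word that feeds prod0..prod7, as listed in the Operation.
    acc = [lw[0], lw[1], hw[0], hw[1], lw[2], lw[3], hw[2], hw[3]]
    res = [(acc[i] - a[i] * b[i]) & 0xFFFFFFFF for i in range(8)]
    new_lo = join([res[0], res[1], res[4], res[5]], 32)
    new_hi = join([res[2], res[3], res[6], res[7]], 32)
    rd = join([res[0], res[2], res[4], res[6]], 32)
    return rd, new_hi, new_lo
```

Note that rd receives only the even-numbered results, while HI and LO together hold all eight.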


Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the PMADDH instruction.
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PMSUBW Parallel Multiply-Subtract Word PMSUBW


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PMSUBW MMI2
C790
Format: PMSUBW rd, rs, rt
Purpose: To multiply 2 pairs of 32-bit signed integers and subtract in parallel.
Description: (rd, HI, LO) ← (HI, LO) - rs × rt


The low-order signed words of the two doublewords in GPR rs are multiplied by the
low-order signed words of the two doublewords in GPR rt in parallel. The two 64-bit multiply
results are subtracted from the contents of special registers HI and LO. The low-order
words of the two doubleword results are placed into special register LO, and the high-order
words of the two doubleword results are placed into special register HI. The two
doubleword results are placed into GPR rd.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


If either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 127..95 and
63..31 equal) then the result of the operation will be undefined.


Operation: 
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
prod0 ← (HI31..0 || LO31..0) - GPR[rs]31..0 × GPR[rt]31..0
prod1 ← (HI95..64 || LO95..64) - GPR[rs]95..64 × GPR[rt]95..64
LO63..0 ← (prod0 31)^32 || prod0 31..0
HI63..0 ← (prod0 63)^32 || prod0 63..32
LO127..64 ← (prod1 31)^32 || prod1 31..0
HI127..64 ← (prod1 63)^32 || prod1 63..32
GPR[rd]63..0 ← prod0 63..0
GPR[rd]127..64 ← prod1 63..0
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127 64 63 0
rd | (C6 || C4) - A2 × B2 | (C2 || C0) - A0 × B0

Exceptions: 


None 


Programming Notes: 


See the Programming Notes for the PMADDH instruction.
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PMTHI Parallel Move To HI Register PMTHI


31 26 25 21 20 11 10 6 5 0
MMI rs 0 PMTHI MMI3
011100 0000000000
C790
Format: PMTHI rs
Purpose: To copy a GPR to the special purpose register HI.
Description: HI ← rs


The contents of GPR rs are loaded into special register HI.


This instruction operates on 128-bit registers. 


Restrictions: 
None 
Operation: 
HI127..0 ← GPR[rs]127..0


Exceptions: 


None 
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PMTHL.fmt Parallel Move To HI / LO Register PMTHL.fmt 


31 26 25 21 20 11 10 6 5 0 
MMI 0 PMTHL 
6 5 10 5 6 
C790 
Format: PMTHL.LW rs (fmt = 0) 
Purpose: To copy a GPR to the special registers HI / LO. 
Description: HI/LO ← rs


The contents of GPR rs are loaded into special registers HI / LO.


This instruction operates on 128-bit registers. 


Restrictions: 
None 
Operation: 
if (fmt = 0) then
    LO31..0 ← GPR[rs]31..0
    LO63..32 ← LO63..32
    HI31..0 ← GPR[rs]63..32
    HI63..32 ← HI63..32
    LO95..64 ← GPR[rs]95..64
    LO127..96 ← LO127..96
    HI95..64 ← GPR[rs]127..96
    HI127..96 ← HI127..96
endif

[Figure: words 0 and 2 of rs are written to the even words of LO, words 1 and 3 of rs to the even words of HI; the odd words of HI and LO are unchanged]
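
As an illustration of the fmt = 0 data movement, the following Python sketch updates only the even words of HI and LO. The function name `pmthl_lw` and the tuple return value are assumptions made for this example; registers are 128-bit integers.

```python
MASK32 = 0xFFFFFFFF
# Mask that keeps words 1 and 3 (the words PMTHL.LW leaves unchanged).
KEEP_ODD_WORDS = 0xFFFFFFFF00000000FFFFFFFF00000000


def pmthl_lw(rs, hi, lo):
    """Hypothetical model of PMTHL.LW; returns (new_hi, new_lo)."""
    w = [(rs >> (32 * i)) & MASK32 for i in range(4)]
    new_lo = (lo & KEEP_ODD_WORDS) | w[0] | (w[2] << 64)
    new_hi = (hi & KEEP_ODD_WORDS) | w[1] | (w[3] << 64)
    return new_hi, new_lo
```

With HI and LO preset to all ones, the untouched odd words remain 0xFFFFFFFF after the move.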


Exceptions: 


None
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PMTLO Parallel Move To LO Register PMTLO


31 26 25 21 20 11 10 6 5 0
MMI rs 0 PMTLO MMI3
011100 0000000000
C790
Format: PMTLO rs
Purpose: To copy a GPR to the special register LO.
Description: LO ← rs


The contents of GPR rs are loaded into special register LO. 


This instruction operates on 128-bit registers. 


Restrictions: 
None 
Operation: 
LO127..0 ← GPR[rs]127..0


Exceptions: 


None
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PMULTH Parallel Multiply Halfword PMULTH


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PMULTH MMI2
C790
Format: PMULTH rd, rs, rt
Purpose: To multiply 8 pairs of 16-bit signed integers in parallel.
Description: (rd, LO, HI) ← rs × rt


The eight signed halfwords in GPR rs are multiplied by the eight signed halfwords in GPR
rt, in parallel. The eight word results are placed into special registers HI, LO and GPR rd.


No arithmetic exception occurs under any circumstances. 


This instruction operates on 128-bit registers. 


Restrictions: 


None 

Operation: 
prod0 ← GPR[rs]15..0 × GPR[rt]15..0
prod1 ← GPR[rs]31..16 × GPR[rt]31..16
prod2 ← GPR[rs]47..32 × GPR[rt]47..32
prod3 ← GPR[rs]63..48 × GPR[rt]63..48
prod4 ← GPR[rs]79..64 × GPR[rt]79..64
prod5 ← GPR[rs]95..80 × GPR[rt]95..80
prod6 ← GPR[rs]111..96 × GPR[rt]111..96
prod7 ← GPR[rs]127..112 × GPR[rt]127..112
LO31..0 ← prod0 31..0
LO63..32 ← prod1 31..0
HI31..0 ← prod2 31..0
HI63..32 ← prod3 31..0
LO95..64 ← prod4 31..0
LO127..96 ← prod5 31..0
HI95..64 ← prod6 31..0
HI127..96 ← prod7 31..0
GPR[rd]31..0 ← prod0 31..0
GPR[rd]63..32 ← prod2 31..0
GPR[rd]95..64 ← prod4 31..0
GPR[rd]127..96 ← prod6 31..0
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[Figure: the eight halfwords of rs are multiplied by the eight halfwords of rt; the word results are written to rd, HI and LO]


Exceptions: 


None 


Programming Notes: 


See the Programming Notes of the PMADDH instruction. 
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PMULTUW Parallel Multiply Unsigned Word PMULTUW 


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PMULTUW MMI3
6 5 5 5 5 6
C790
Format: PMULTUW rd, rs, rt
Purpose: To multiply 2 pairs of 32-bit unsigned integers in parallel.
Description: (rd, LO, HI) ← rs × rt


The low-order unsigned words of the two doublewords in GPR rs are multiplied by the
low-order unsigned words of the two doublewords in GPR rt in parallel. The low-order
words of the two doubleword results are placed into special register LO, and the high-order
words of the two doubleword results are placed into special register HI. The two doubleword
results are placed into GPR rd.


No arithmetic exception occurs under any circumstances. 
This instruction operates on 128-bit registers. 
Restrictions: 


If either GPR rt or GPR rs does not contain zero-extended 32-bit values (bits 127..96 and
63..32 equal zero) then the result of the operation will be undefined.


Operation: 
if (NotWordValue (GPR[rs]) or NotWordValue (GPR[rt])) then UndefinedResult() endif
prod0 ← (0 || GPR[rs]31..0) × (0 || GPR[rt]31..0)
prod1 ← (0 || GPR[rs]95..64) × (0 || GPR[rt]95..64)
LO63..0 ← (prod0 31)^32 || prod0 31..0
HI63..0 ← (prod0 63)^32 || prod0 63..32
LO127..64 ← (prod1 31)^32 || prod1 31..0
HI127..64 ← (prod1 63)^32 || prod1 63..32
GPR[rd]63..0 ← prod0
GPR[rd]127..64 ← prod1


127 64 63 0
rd | (0 || A2) × (0 || B2) | (0 || A0) × (0 || B0)
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Exceptions: 


None 


Programming Notes: 


See the Programming Notes of the PMADDH instruction. 
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PMULTW Parallel Multiply Word PMULTW 


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PMULTW MMI2
6 5 5 5 5 6
C790
Format: PMULTW rd, rs, rt
Purpose: To multiply 2 pairs of 32-bit signed integers in parallel.
Description: (rd, LO, HI) ← rs × rt


The low-order signed words of the two doublewords in GPR rs are multiplied by the
low-order signed words of the two doublewords in GPR rt in parallel. The low-order words of
the two doubleword results are placed into special register LO, and the high-order words of
the two doubleword results are placed into special register HI. The two doubleword results
are placed into GPR rd.


No arithmetic exception occurs under any circumstances. 
This instruction operates on 128-bit registers. 


Restrictions: 


If either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 127..95 and
63..31 equal) then the result of the operation will be undefined.


Operation: 
if (NotWordValue (GPR[rs]) or NotWordValue (GPR[rt])) then UndefinedResult() endif
prod0 ← GPR[rs]31..0 × GPR[rt]31..0
prod1 ← GPR[rs]95..64 × GPR[rt]95..64
LO63..0 ← (prod0 31)^32 || prod0 31..0
HI63..0 ← (prod0 63)^32 || prod0 63..32
LO127..64 ← (prod1 31)^32 || prod1 31..0
HI127..64 ← (prod1 63)^32 || prod1 63..32
GPR[rd]63..0 ← prod0
GPR[rd]127..64 ← prod1


127 64 63 0
rd | A2 × B2 | A0 × B0
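
The way each 64-bit product is split across LO, HI and rd can be sketched in Python as follows. The helper names `sext32` and `pmultw` are hypothetical, registers are 128-bit integers, and the sketch reads only the low-order word of each doubleword, so it does not model the undefined-result case described under Restrictions.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sext32(w):
    """Sign-extend a 32-bit field to a 64-bit field."""
    return w | 0xFFFFFFFF00000000 if w & 0x80000000 else w


def pmultw(rs, rt):
    """Hypothetical model of PMULTW; returns (rd, hi, lo)."""
    rd = hi = lo = 0
    for half in range(2):
        shift = 64 * half
        a = (rs >> shift) & MASK32
        b = (rt >> shift) & MASK32
        a = a - (1 << 32) if a & 0x80000000 else a
        b = b - (1 << 32) if b & 0x80000000 else b
        prod = (a * b) & MASK64
        lo |= sext32(prod & MASK32) << shift   # low word, sign-extended
        hi |= sext32(prod >> 32) << shift      # high word, sign-extended
        rd |= prod << shift                    # full doubleword product
    return rd, hi, lo
```

Multiplying -1 by 2 in the low doubleword, for instance, yields 0xFFFFFFFFFFFFFFFE in rd and in the low half of LO.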
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Exceptions: 


None 


Programming Notes: 


See the Programming Notes of the PMADDH instruction. 
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PNOR Parallel Not Or PNOR 


31 26 25 21 20 16 15 11 10 6 5 0 
011100 10011 101001 
6 5 5 5 5 6 
C790 
Format: PNOR rd, rs, rt 
Purpose: To do a bitwise logical NOT OR (NOR). 
Description: rd ← rs NOR rt


The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical NOR 
operation. The result is placed into GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 
GPR[rd]127..0 ← GPR[rs]127..0 nor GPR[rt]127..0


127 64 63 0
rd | A1 NOR B1 | A0 NOR B0
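
A bitwise NOR over 128 bits reduces to one expression once the result is masked to the register width. This Python sketch uses hypothetical names (`pnor`, `por`) and represents registers as 128-bit integers; the OR helper is included only for comparison.

```python
MASK128 = (1 << 128) - 1


def pnor(rs, rt):
    """Hypothetical model of PNOR: bitwise NOT of (rs OR rt), 128 bits wide."""
    return ~(rs | rt) & MASK128


def por(rs, rt):
    """Bitwise OR of the same two values, for comparison."""
    return (rs | rt) & MASK128
```

The mask matters in Python because `~` on an unbounded integer would otherwise produce a negative number.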


Exceptions: 


None 
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POR Parallel Or POR 


31 26 25 21 20 16 15 11 10 6 5 0 
011100 10010 101001 
6 5 5 5 5 6 
C790 
Format: POR rd, rs, rt 
Purpose: To do a bitwise logical OR. 
Description: rd ← rs OR rt


The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical OR 
operation. The result is placed into GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 
GPR[rd]127..0 ← GPR[rs]127..0 or GPR[rt]127..0


127 64 63 0
rd | A1 OR B1 | A0 OR B0


Exceptions: 


None
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
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PPAC5 Parallel Pack to 5-bits PPAC5


31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 rt rd PPAC5 MMI0
C790
Format: PPAC5 rd, rt
Purpose: To truncate and pack data into consecutive 5-bits.
Description: rd ← pack (rt)


The four 32-bit words (8, 8, 8, 8 bit) in GPR rt are packed into the four 16-bit halfwords (1, 
5, 5, 5 bit). The results are placed into GPR rd. See diagram on next page. 


This instruction operates on 128-bit registers. 


Operation:

GPR[rd]4..0 ← GPR[rt]7..3
GPR[rd]9..5 ← GPR[rt]15..11
GPR[rd]14..10 ← GPR[rt]23..19
GPR[rd]15 ← GPR[rt]31
GPR[rd]31..16 ← 0^16
GPR[rd]36..32 ← GPR[rt]39..35
GPR[rd]41..37 ← GPR[rt]47..43
GPR[rd]46..42 ← GPR[rt]55..51
GPR[rd]47 ← GPR[rt]63
GPR[rd]63..48 ← 0^16
GPR[rd]68..64 ← GPR[rt]71..67
GPR[rd]73..69 ← GPR[rt]79..75
GPR[rd]78..74 ← GPR[rt]87..83
GPR[rd]79 ← GPR[rt]95
GPR[rd]95..80 ← 0^16
GPR[rd]100..96 ← GPR[rt]103..99
GPR[rd]105..101 ← GPR[rt]111..107
GPR[rd]110..106 ← GPR[rt]119..115
GPR[rd]111 ← GPR[rt]127
GPR[rd]127..112 ← 0^16
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[Overview]
127 96 95 64 63 32 31 0
[Figure: each 32-bit word of rt (8, 8, 8, 8 bits) is packed into the low-order halfword of the corresponding word of rd as 1, 5, 5, 5 bits]
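
The bit selection listed in the Operation can be expressed with shifts and masks. The following Python sketch is an illustration with a hypothetical function name `ppac5`; it treats rt as a 128-bit integer and follows the bit ranges given above.

```python
def ppac5(rt):
    """Hypothetical model of PPAC5: pack each 32-bit word into 1-5-5-5 bits."""
    rd = 0
    for w in range(4):
        word = (rt >> (32 * w)) & 0xFFFFFFFF
        packed = (
            ((word >> 3) & 0x1F)               # bits 7..3   -> bits 4..0
            | (((word >> 11) & 0x1F) << 5)     # bits 15..11 -> bits 9..5
            | (((word >> 19) & 0x1F) << 10)    # bits 23..19 -> bits 14..10
            | (((word >> 31) & 0x1) << 15)     # bit 31      -> bit 15
        )
        # The upper halfword of each result word stays zero.
        rd |= packed << (32 * w)
    return rd
```

Only the top five bits of each of the three low bytes survive, so 0x07070707 packs to zero.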


Exceptions: 


None
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PPACB Parallel Pack to Byte PPACB


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PPACB MMI0
C790
Format: PPACB rd, rs, rt
Purpose: To pack into consecutive bytes.
Description: rd ← pack (rs, rt)


The low-order bytes of the eight halfwords in GPR rs are packed into consecutive bytes of 
the high-order doubleword in GPR rd. Similarly, the low-order bytes of the eight halfwords 
in GPR rt are packed into consecutive bytes of the low-order doubleword in GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]7..0 ← GPR[rt]7..0
GPR[rd]15..8 ← GPR[rt]23..16
GPR[rd]23..16 ← GPR[rt]39..32
GPR[rd]31..24 ← GPR[rt]55..48
GPR[rd]39..32 ← GPR[rt]71..64
GPR[rd]47..40 ← GPR[rt]87..80
GPR[rd]55..48 ← GPR[rt]103..96
GPR[rd]63..56 ← GPR[rt]119..112
GPR[rd]71..64 ← GPR[rs]7..0
GPR[rd]79..72 ← GPR[rs]23..16
GPR[rd]87..80 ← GPR[rs]39..32
GPR[rd]95..88 ← GPR[rs]55..48
GPR[rd]103..96 ← GPR[rs]71..64
GPR[rd]111..104 ← GPR[rs]87..80
GPR[rd]119..112 ← GPR[rs]103..96
GPR[rd]127..120 ← GPR[rs]119..112


127 120 119 112 111 104 103 96 95 88 87 80 79 72 71 64 63 56 55 48 47 40 39 32 31 24 23 16 15 8 7 0


Exceptions: 


None
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PPACH Parallel Pack to Halfword PPACH


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PPACH MMI0
C790
Format: PPACH rd, rs, rt
Purpose: To pack into consecutive halfwords.
Description: rd ← pack (rs, rt)


The low-order halfwords of the four words in GPR rs are packed into consecutive 
halfwords of the high-order doubleword in GPR rd. Similarly, the low-order halfwords of 
the four words in GPR rt are packed into consecutive halfwords of the low-order 
doubleword in GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]15..0 ← GPR[rt]15..0
GPR[rd]31..16 ← GPR[rt]47..32
GPR[rd]47..32 ← GPR[rt]79..64
GPR[rd]63..48 ← GPR[rt]111..96
GPR[rd]79..64 ← GPR[rs]15..0
GPR[rd]95..80 ← GPR[rs]47..32
GPR[rd]111..96 ← GPR[rs]79..64
GPR[rd]127..112 ← GPR[rs]111..96


127 112 111 96 95 80 79 64 63 48 47 32 31 16 15 0


Exceptions: 


None
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PPACW Parallel Pack to Word PPACW


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PPACW MMI0
C790
Format: PPACW rd, rs, rt
Purpose: To pack into consecutive words.
Description: rd ← pack (rs, rt)


The low-order words of the two doublewords in GPR rs are packed into consecutive words 
of the high-order doubleword in GPR rd. Similarly, the low-order words of the two 
doublewords in GPR rt are packed into consecutive words of the low-order doubleword in 
GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]31..0 ← GPR[rt]31..0
GPR[rd]63..32 ← GPR[rt]95..64
GPR[rd]95..64 ← GPR[rs]31..0
GPR[rd]127..96 ← GPR[rs]95..64
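
PPACB, PPACH and PPACW follow one pattern: keep the low-order half of every source element, with rt filling the low doubleword of rd and rs the high doubleword. The Python sketch below captures that pattern in a single helper. All names (`pack_low_halves`, `ppacb`, `ppach`, `ppacw`) are hypothetical, and registers are 128-bit integers.

```python
def pack_low_halves(rs, rt, lane_bits):
    """Keep the low half of every lane; rt fills bits 63..0, rs bits 127..64."""
    half = lane_bits // 2
    mask = (1 << half) - 1

    def squeeze(reg):
        out = 0
        for i in range(128 // lane_bits):
            out |= ((reg >> (lane_bits * i)) & mask) << (half * i)
        return out

    return (squeeze(rs) << 64) | squeeze(rt)


def ppacb(rs, rt):
    return pack_low_halves(rs, rt, 16)   # halfwords -> bytes


def ppach(rs, rt):
    return pack_low_halves(rs, rt, 32)   # words -> halfwords


def ppacw(rs, rt):
    return pack_low_halves(rs, rt, 64)   # doublewords -> words
```

Writing the three instructions over one helper makes it clear that only the element width changes between them.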


Exceptions: 


None
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PREVH Parallel Reverse Halfword PREVH


31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 rt rd PREVH MMI2
011100 00000
C790
Format: PREVH rd, rt
Purpose: To reverse halfwords.
Description: rd ← reverse (rt)


The four high-order halfwords in GPR rt are reversed and the four low-order halfwords in 
GPR rt are reversed. The results are placed into GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 


GPR[rd]15..0 ← GPR[rt]63..48
GPR[rd]31..16 ← GPR[rt]47..32
GPR[rd]47..32 ← GPR[rt]31..16
GPR[rd]63..48 ← GPR[rt]15..0
GPR[rd]79..64 ← GPR[rt]127..112
GPR[rd]95..80 ← GPR[rt]111..96
GPR[rd]111..96 ← GPR[rt]95..80
GPR[rd]127..112 ← GPR[rt]79..64


127 112 111 96 95 80 79 64 63 48 47 32 31 16 15 0
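
The reversal acts separately on the low four and the high four halfwords. A minimal Python sketch, with the hypothetical name `prevh` and a 128-bit integer for rt, could look like this:

```python
def prevh(rt):
    """Hypothetical model of PREVH: reverse halfwords within each doubleword."""
    h = [(rt >> (16 * i)) & 0xFFFF for i in range(8)]
    # Reverse halfwords 3..0, then halfwords 7..4.
    order = h[3::-1] + h[7:3:-1]
    return sum(x << (16 * i) for i, x in enumerate(order))
```

Applying the function twice returns the original value, which is a convenient sanity check.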


Exceptions: 


None
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PROT3W Parallel Rotate 3 Words Left PROT3W


31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 rt rd PROT3W MMI2
011100 00000
C790
Format: PROT3W rd, rt
Purpose: To rotate words.
Description: rd ← rotate (rt)


The three low-order words in GPR rt are rotated to the right. The results are placed into 
GPR rd while the other word is copied directly to the corresponding word in GPR rd. 


This instruction operates on 128-bit registers. 


Operation: 
GPR[rd]31..0 ← GPR[rt]63..32
GPR[rd]63..32 ← GPR[rt]95..64
GPR[rd]95..64 ← GPR[rt]31..0
GPR[rd]127..96 ← GPR[rt]127..96


127 96 95 64 63 32 31 0 
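
The word movement listed in the Operation can be checked with a short model. This Python sketch uses the hypothetical name `prot3w` and a 128-bit integer for rt; it follows the four assignments above literally.

```python
def prot3w(rt):
    """Hypothetical model of PROT3W: rotate the three low-order words."""
    w = [(rt >> (32 * i)) & 0xFFFFFFFF for i in range(4)]
    # rd word 0 <- rt word 1, word 1 <- word 2, word 2 <- word 0, word 3 kept.
    out = [w[1], w[2], w[0], w[3]]
    return sum(x << (32 * i) for i, x in enumerate(out))
```

Three successive applications restore the original register contents, since only three words take part in the rotation.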


Exceptions: 


None
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PSLLH Parallel Shift Left Logical Halfword PSLLH


31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 rt rd sa PSLLH
011100 00000
C790
Format: PSLLH rd, rt, sa
Purpose: To logically shift left 8 halfwords by a fixed number of bits, in parallel.
Description: rd ← rt << sa (logical)


The eight halfwords in GPR rt are shifted left in parallel, inserting zeros into the emptied 
bits; the results are placed into the corresponding eight halfwords in GPR rd. The bit shift 
count is specified by the low-order four bits of sa. 


This instruction operates on 128-bit registers. 


Operation: 


s ← sa3..0

GPR[rd]15..0 ← GPR[rt](15-s)..0 || 0^s
GPR[rd]31..16 ← GPR[rt](31-s)..16 || 0^s
GPR[rd]47..32 ← GPR[rt](47-s)..32 || 0^s
GPR[rd]63..48 ← GPR[rt](63-s)..48 || 0^s
GPR[rd]79..64 ← GPR[rt](79-s)..64 || 0^s
GPR[rd]95..80 ← GPR[rt](95-s)..80 || 0^s
GPR[rd]111..96 ← GPR[rt](111-s)..96 || 0^s
GPR[rd]127..112 ← GPR[rt](127-s)..112 || 0^s


127 112 111 96 95 80 79 64 63 48 47 32 31 16 15 0


s bit s bit s bit s bit s bit s bit s bit s bit 
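
A lane-wise left shift must discard the bits that leave each halfword rather than let them spill into the neighbouring lane. The Python sketch below illustrates this with the hypothetical name `psllh`; rt is a 128-bit integer and only the low-order four bits of sa are used, as stated in the description.

```python
def psllh(rt, sa):
    """Hypothetical model of PSLLH: shift each halfword left by sa[3..0]."""
    s = sa & 0xF
    rd = 0
    for i in range(8):
        h = (rt >> (16 * i)) & 0xFFFF
        # Mask after shifting so bits do not carry into the next halfword.
        rd |= ((h << s) & 0xFFFF) << (16 * i)
    return rd
```

A shift count of 16 behaves like a count of 0 in this model because bit 4 of sa is ignored.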


Exceptions: 


None
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PSLLVW Parallel Shift Left Logical Variable Word PSLLVW


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PSLLVW MMI2
C790
Format: PSLLVW rd, rt, rs
Purpose: To logically shift left 2 words by a variable number of bits, in parallel.
Description: rd ← rt << rs (logical)


The low-order words of the two doublewords in GPR rt are shifted left in parallel, 
inserting zeros into the emptied bits; the results are placed into the corresponding two 
words in GPR rd. The bit shift counts are specified by the low-order five bits of the two
doublewords in GPR rs. 


This instruction operates on 128-bit registers. 


Operation: 
s0 ← GPR[rs]4..0
s1 ← GPR[rs]68..64
temp0 ← GPR[rt](31-s0)..0 || 0^s0
temp1 ← GPR[rt](95-s1)..64 || 0^s1
GPR[rd]63..0 ← (temp0 31)^32 || temp0 31..0
GPR[rd]127..64 ← (temp1 31)^32 || temp1 31..0


[Figure: shift counts s1 (rs bits 68..64) and s0 (rs bits 4..0); the low-order word of each doubleword of rt is shifted left and the result is sign-extended into the corresponding doubleword of rd]
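
Two details of the Operation are worth seeing in code: each doubleword has its own shift count, and the 32-bit result is sign-extended to 64 bits. The Python sketch below is an illustration with hypothetical names (`sext32`, `psllvw`) and 128-bit integers for the registers.

```python
MASK32 = 0xFFFFFFFF


def sext32(w):
    """Sign-extend a 32-bit field to 64 bits."""
    return w | 0xFFFFFFFF00000000 if w & 0x80000000 else w


def psllvw(rt, rs):
    """Hypothetical model of PSLLVW."""
    rd = 0
    for half in range(2):
        shift = 64 * half
        s = (rs >> shift) & 0x1F           # per-doubleword shift count
        word = (rt >> shift) & MASK32      # low-order word of the doubleword
        temp = (word << s) & MASK32
        rd |= sext32(temp) << shift
    return rd
```

Shifting 1 left by 31 sets bit 31 of the word, so the upper 32 bits of the result become ones.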


Exceptions: 


None
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PSLLW Parallel Shift Left Logical Word PSLLW


31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 rt rd sa PSLLW
011100 00000
C790
Format: PSLLW rd, rt, sa
Purpose: To logically shift left 4 words by a fixed number of bits, in parallel.
Description: rd ← rt << sa (logical)


The four words in GPR rt are shifted left in parallel by the bit count given in the five
bits of sa, inserting zeros into the emptied bits; the results are placed into the
corresponding four words in GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


s ← sa4..0

GPR[rd]31..0 ← GPR[rt](31-s)..0 || 0^s
GPR[rd]63..32 ← GPR[rt](63-s)..32 || 0^s
GPR[rd]95..64 ← GPR[rt](95-s)..64 || 0^s
GPR[rd]127..96 ← GPR[rt](127-s)..96 || 0^s


127 96 95 64 63 32 31 0
s bit s bit s bit s bit


Exceptions: 


None
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PSRAH Parallel Shift Right Arithmetic Halfword PSRAH


31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 rt rd sa PSRAH
011100 00000
C790
Format: PSRAH rd, rt, sa
Purpose: To arithmetically shift right 8 halfwords by a fixed number of bits, in parallel.
Description: rd ← rt >> sa (arithmetic)


The eight halfwords in GPR rt are shifted right by sa bits in parallel sign extending the 
high order bits; the results are placed into the corresponding eight halfwords in GPR rd. 
The bit shift count is specified by the low-order four bits of sa. 


This instruction operates on 128-bit registers. 


Operation: 


s ← sa3..0

GPR[rd]15..0 ← (GPR[rt]15)^s || GPR[rt]15..s
GPR[rd]31..16 ← (GPR[rt]31)^s || GPR[rt]31..(16+s)
GPR[rd]47..32 ← (GPR[rt]47)^s || GPR[rt]47..(32+s)
GPR[rd]63..48 ← (GPR[rt]63)^s || GPR[rt]63..(48+s)
GPR[rd]79..64 ← (GPR[rt]79)^s || GPR[rt]79..(64+s)
GPR[rd]95..80 ← (GPR[rt]95)^s || GPR[rt]95..(80+s)
GPR[rd]111..96 ← (GPR[rt]111)^s || GPR[rt]111..(96+s)
GPR[rd]127..112 ← (GPR[rt]127)^s || GPR[rt]127..(112+s)


127 112 111 96 95 80 79 64 63 48 47 32 31 16 15 0
s bit s bit s bit s bit s bit s bit s bit s bit
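
The replication of the sign bit written as (GPR[rt]n)^s in the Operation corresponds to an ordinary arithmetic right shift of each halfword. A minimal Python sketch, with the hypothetical name `psrah` and a 128-bit integer for rt, is shown below; it relies on Python's `>>` being arithmetic for negative integers.

```python
def psrah(rt, sa):
    """Hypothetical model of PSRAH: arithmetic right shift of each halfword."""
    s = sa & 0xF
    rd = 0
    for i in range(8):
        h = (rt >> (16 * i)) & 0xFFFF
        signed_h = h - 0x10000 if h & 0x8000 else h
        # Python's >> on a negative int replicates the sign bit.
        rd |= ((signed_h >> s) & 0xFFFF) << (16 * i)
    return rd
```

For example, shifting 0x8000 right by four positions gives 0xF800, with the vacated bits filled by the sign.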


Exceptions: 


None
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PSRAVW Parallel Shift Right Arithmetic Variable Word PSRAVW


31 26 25 21 20 16 15 11 10 6 5 0
MMI rs rt rd PSRAVW MMI3
C790
Format: PSRAVW rd, rt, rs
Purpose: To arithmetically shift right 2 words by a variable number of bits, in parallel.
Description: rd ← rt >> rs (arithmetic)


The low-order words of the two doublewords in GPR rt are shifted right in parallel, sign 
extending the high order bits; the results are placed into the corresponding two words in 
GPR rd. The bit shift counts are specified by the low-order five bits of the two doublewords 
in GPR rs. 


This instruction operates on 128-bit registers. 


Operation: 
s0 ← GPR[rs]4..0
s1 ← GPR[rs]68..64
temp0 ← (GPR[rt]31)^s0 || GPR[rt]31..s0
temp1 ← (GPR[rt]95)^s1 || GPR[rt]95..(64+s1)
GPR[rd]63..0 ← (temp0 31)^32 || temp0 31..0
GPR[rd]127..64 ← (temp1 31)^32 || temp1 31..0


[Figure: shift counts s1 (rs bits 68..64) and s0 (rs bits 4..0); the low-order word of each doubleword of rt is shifted right with sign fill and the result is sign-extended into the corresponding doubleword of rd]
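
The variable arithmetic shift combines a per-doubleword shift count with sign extension of the 32-bit result. The Python sketch below is illustrative only: the name `psravw` is hypothetical and registers are 128-bit integers.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def psravw(rt, rs):
    """Hypothetical model of PSRAVW."""
    rd = 0
    for half in range(2):
        shift = 64 * half
        s = (rs >> shift) & 0x1F
        word = (rt >> shift) & MASK32
        value = word - (1 << 32) if word & 0x80000000 else word
        # Arithmetic shift, then encode the signed result in 64 bits.
        rd |= ((value >> s) & MASK64) << shift
    return rd
```

Masking the shifted signed value with 64 ones performs the sign extension in one step.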


Exceptions: 


None
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PSRAW Parallel Shift Right Arithmetic Word PSRAW


31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 rt rd sa PSRAW
011100 00000
C790
Format: PSRAW rd, rt, sa
Purpose: To arithmetically shift right 4 words by a fixed number of bits, in parallel.
Description: rd ← rt >> sa (arithmetic)


The four words in GPR rt are shifted right in parallel by the bit count given in the five
bits of sa, sign extending the high order bits; the results are placed into the
corresponding four words in GPR rd.


This instruction operates on 128-bit registers. 


Operation: 


s ← sa4..0

GPR[rd]31..0 ← (GPR[rt]31)^s || GPR[rt]31..s
GPR[rd]63..32 ← (GPR[rt]63)^s || GPR[rt]63..(32+s)
GPR[rd]95..64 ← (GPR[rt]95)^s || GPR[rt]95..(64+s)
GPR[rd]127..96 ← (GPR[rt]127)^s || GPR[rt]127..(96+s)


127 96 95 64 63 32 31 0
s bit s bit s bit s bit


Exceptions: 


None
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
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PSRLH Parallel Shift Right Logical Halfword PSRLH


31 26 25 21 20 16 15 11 10 6 5 0
MMI 0 rt rd sa PSRLH
C790
Format: PSRLH rd, rt, sa
Purpose: To logically shift right 8 halfwords by a fixed number of bits, in parallel.
Description: rd ← rt >> sa (logical)


The eight halfwords in GPR rt are shifted right by sa bits, in parallel, inserting zeros into 
the high order bits; the results are placed into the corresponding eight halfwords in GPR 
rd. The bit shift count is specified by the low-order four bits of sa. 


This instruction operates on 128-bit registers. 


Operation:


s ← sa3..0

GPR[rd]15..0 ← 0^s || GPR[rt]15..s
GPR[rd]31..16 ← 0^s || GPR[rt]31..(16+s)
GPR[rd]47..32 ← 0^s || GPR[rt]47..(32+s)
GPR[rd]63..48 ← 0^s || GPR[rt]63..(48+s)
GPR[rd]79..64 ← 0^s || GPR[rt]79..(64+s)
GPR[rd]95..80 ← 0^s || GPR[rt]95..(80+s)
GPR[rd]111..96 ← 0^s || GPR[rt]111..(96+s)
GPR[rd]127..112 ← 0^s || GPR[rt]127..(112+s)


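[Figure: each of the eight halfwords of rt is shifted right by s bits, with zeros inserted into the vacated high-order bits, and written to the same halfword of rd.]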


Exceptions: 


None 
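A minimal software sketch of the PSRLH operation follows, assuming a 128-bit register is held in a Python integer. The function name `psrlh` is hypothetical and used only for illustration.

```python
def psrlh(rt, sa):
    """Model of PSRLH: shift each 16-bit halfword right by sa with zero fill."""
    s = sa & 0xF                               # only the low-order four bits of sa are used
    result = 0
    for lane in range(8):                      # eight 16-bit halfwords
        base = 16 * lane
        half = (rt >> base) & 0xFFFF
        result |= (half >> s) << base          # zeros enter from the top of each halfword
    return result
```

Because each halfword is masked before shifting, no bits leak from a halfword into its lower neighbour.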


PSRLVW Parallel Shift Right Logical Variable Word PSRLVW


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSRLVW rd, rt, rs
Purpose: To logically shift right 2 words by a variable number of bits, in parallel.
Description: rd ← rt >> rs (logical)


The low-order words of the two doublewords in GPR rt are shifted right in parallel,
inserting zeros into the high order bits. Each 32-bit result is then sign extended to 64 bits
and placed into the corresponding doubleword in GPR rd. The bit shift counts are specified
by the low-order five bits of the two doublewords in GPR rs.


This instruction operates on 128-bit registers. 


Operation:


s0 ← GPR[rs]4..0
s1 ← GPR[rs]68..64
temp0 ← 0^s0 || GPR[rt]31..s0
temp1 ← 0^s1 || GPR[rt]95..(64+s1)
GPR[rd]63..0 ← (temp0[31])^32 || temp0[31..0]
GPR[rd]127..64 ← (temp1[31])^32 || temp1[31..0]
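The combination of a zero-filling shift followed by sign extension can be sketched as below. This is an illustrative model only; `psrlvw` is a hypothetical name and the use of Python integers for 128-bit registers is an assumption.

```python
MASK32 = 0xFFFFFFFF


def psrlvw(rt, rs):
    """Model of PSRLVW: logical word shift, result sign-extended to a doubleword."""
    result = 0
    for lane in range(2):                      # two 64-bit doublewords
        base = 64 * lane
        count = (rs >> base) & 0x1F            # low-order five bits of each doubleword of rs
        word = (rt >> base) & MASK32           # low-order word of each doubleword of rt
        shifted = word >> count                # logical shift: zero fill
        if shifted & 0x80000000:               # bit 31 can only survive when count is 0
            shifted |= 0xFFFFFFFF00000000      # sign-extend to 64 bits
        result |= shifted << base
    return result
```

In this model the sign extension has a visible effect only for a shift count of zero, since any non-zero logical shift clears bit 31 of the 32-bit result.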


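[Figure: the shift counts s1 and s0 are taken from bits 68..64 and 4..0 of rs; the words in bits 95..64 and 31..0 of rt are shifted right by s1 and s0 bits with zero fill and placed into bits 127..64 and 63..0 of rd.]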


Exceptions: 


None
PSRLW Parallel Shift Right Logical Word PSRLW


[Encoding diagram: bit fields 31..26, 25..21 (00000), 20..16, 15..11, 10..6, 5..0]
C790
Format: PSRLW rd, rt, sa
Purpose: To logically shift right 4 words by a fixed number of bits, in parallel.
Description: rd ← rt >> sa (logical)


The four words in GPR rt are shifted right in parallel by the number of bits given in the
five-bit sa field, inserting zeros into the high order bits; the results are placed into the
corresponding four words in GPR rd.


This instruction operates on 128-bit registers. 


Operation:


s ← sa4..0

GPR[rd]31..0 ← 0^s || GPR[rt]31..s
GPR[rd]63..32 ← 0^s || GPR[rt]63..(32+s)
GPR[rd]95..64 ← 0^s || GPR[rt]95..(64+s)
GPR[rd]127..96 ← 0^s || GPR[rt]127..(96+s)


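[Figure: each of the four words of rt (bits 127..96, 95..64, 63..32, 31..0) is shifted right by s bits with zero fill and written to the same word of rd.]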


Exceptions: 


None
PSUBB Parallel Subtract Byte PSUBB


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSUBB rd, rs, rt
Purpose: To subtract 16 pairs of 8-bit integers in parallel.
Description: rd ← rs - rt


The sixteen signed byte values in GPR rt are subtracted from the corresponding sixteen
byte values in GPR rs in parallel. The results are placed into the corresponding sixteen
bytes in GPR rd.


No overflow or underflow exceptions are generated under any circumstances. 


This instruction operates on 128-bit registers. 


Operation:
GPR[rd]7..0 ← (GPR[rs]7..0 - GPR[rt]7..0)7..0
GPR[rd]15..8 ← (GPR[rs]15..8 - GPR[rt]15..8)7..0
GPR[rd]23..16 ← (GPR[rs]23..16 - GPR[rt]23..16)7..0
GPR[rd]31..24 ← (GPR[rs]31..24 - GPR[rt]31..24)7..0
GPR[rd]39..32 ← (GPR[rs]39..32 - GPR[rt]39..32)7..0
GPR[rd]47..40 ← (GPR[rs]47..40 - GPR[rt]47..40)7..0
GPR[rd]55..48 ← (GPR[rs]55..48 - GPR[rt]55..48)7..0
GPR[rd]63..56 ← (GPR[rs]63..56 - GPR[rt]63..56)7..0
GPR[rd]71..64 ← (GPR[rs]71..64 - GPR[rt]71..64)7..0
GPR[rd]79..72 ← (GPR[rs]79..72 - GPR[rt]79..72)7..0
GPR[rd]87..80 ← (GPR[rs]87..80 - GPR[rt]87..80)7..0
GPR[rd]95..88 ← (GPR[rs]95..88 - GPR[rt]95..88)7..0
GPR[rd]103..96 ← (GPR[rs]103..96 - GPR[rt]103..96)7..0
GPR[rd]111..104 ← (GPR[rs]111..104 - GPR[rt]111..104)7..0
GPR[rd]119..112 ← (GPR[rs]119..112 - GPR[rt]119..112)7..0
GPR[rd]127..120 ← (GPR[rs]127..120 - GPR[rt]127..120)7..0
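The key property of the byte-wise subtraction is that each lane wraps modulo 256 and never borrows from its neighbour. The sketch below illustrates this, assuming 128-bit registers are represented as Python integers; `psubb` is a hypothetical helper name.

```python
def psubb(rs, rt):
    """Model of PSUBB: sixteen independent 8-bit subtractions with wraparound."""
    result = 0
    for lane in range(16):                     # sixteen bytes
        base = 8 * lane
        a = (rs >> base) & 0xFF
        b = (rt >> base) & 0xFF
        result |= ((a - b) & 0xFF) << base     # keep bits 7..0 of the difference
    return result
```

For example, with rs = 0x0100 and rt = 0x0001 the model yields 0x01FF, whereas an ordinary 16-bit subtraction would borrow across the byte boundary and yield 0x00FF.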


[Figure: rs holds bytes A15..A0 and rt holds bytes B15..B0 in bits 127..120 down to 7..0; byte i of rd receives Ai - Bi.]


Exceptions: 


None
PSUBH Parallel Subtract Halfword PSUBH


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSUBH rd, rs, rt
Purpose: To subtract 8 pairs of 16-bit integers in parallel.
Description: rd ← rs - rt


The eight signed halfwords in GPR rt are subtracted from the corresponding eight
halfwords in GPR rs in parallel. The results are placed into the corresponding eight
halfwords in GPR rd.


No overflow or underflow exceptions are generated under any circumstances. 


This instruction operates on 128-bit registers. 


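Operation:


GPR[rd]15..0 ← (GPR[rs]15..0 - GPR[rt]15..0)15..0
GPR[rd]31..16 ← (GPR[rs]31..16 - GPR[rt]31..16)15..0
GPR[rd]47..32 ← (GPR[rs]47..32 - GPR[rt]47..32)15..0
GPR[rd]63..48 ← (GPR[rs]63..48 - GPR[rt]63..48)15..0
GPR[rd]79..64 ← (GPR[rs]79..64 - GPR[rt]79..64)15..0
GPR[rd]95..80 ← (GPR[rs]95..80 - GPR[rt]95..80)15..0
GPR[rd]111..96 ← (GPR[rs]111..96 - GPR[rt]111..96)15..0
GPR[rd]127..112 ← (GPR[rs]127..112 - GPR[rt]127..112)15..0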

[Figure: rs holds halfwords A7..A0 and rt holds halfwords B7..B0 in bits 127..112 down to 15..0; halfword i of rd receives Ai - Bi.]


Exceptions: 


None
PSUBSB Parallel Subtract with Signed Saturation Byte PSUBSB


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSUBSB rd, rs, rt
Purpose: To subtract 16 pairs of 8-bit signed integers with saturation in parallel.
Description: rd ← rs - rt


The sixteen signed bytes in GPR rt are subtracted from the corresponding sixteen signed
bytes in GPR rs in parallel. The results are placed into the corresponding sixteen bytes in
GPR rd.


No overflow or underflow exceptions are generated under any circumstances. Results 
beyond the range of a signed byte value are saturated according to the following: 


Overflow: 0x7F
Underflow: 0x80 


This instruction operates on 128-bit registers. 


Operation:


if ((GPR[rs]7..0 - GPR[rt]7..0) > 0x7F) then
    GPR[rd]7..0 ← 0x7F
else if (0x100 <= (GPR[rs]7..0 - GPR[rt]7..0) < 0x180) then
    GPR[rd]7..0 ← 0x80
else
    GPR[rd]7..0 ← (GPR[rs]7..0 - GPR[rt]7..0)7..0
endif
if ((GPR[rs]15..8 - GPR[rt]15..8) > 0x7F) then
    GPR[rd]15..8 ← 0x7F
else if (0x100 <= (GPR[rs]15..8 - GPR[rt]15..8) < 0x180) then
    GPR[rd]15..8 ← 0x80
else
    GPR[rd]15..8 ← (GPR[rs]15..8 - GPR[rt]15..8)7..0
endif
if ((GPR[rs]23..16 - GPR[rt]23..16) > 0x7F) then
    GPR[rd]23..16 ← 0x7F
else if (0x100 <= (GPR[rs]23..16 - GPR[rt]23..16) < 0x180) then
    GPR[rd]23..16 ← 0x80
else
    GPR[rd]23..16 ← (GPR[rs]23..16 - GPR[rt]23..16)7..0
endif
if ((GPR[rs]31..24 - GPR[rt]31..24) > 0x7F) then
    GPR[rd]31..24 ← 0x7F
else if (0x100 <= (GPR[rs]31..24 - GPR[rt]31..24) < 0x180) then
    GPR[rd]31..24 ← 0x80
else
    GPR[rd]31..24 ← (GPR[rs]31..24 - GPR[rt]31..24)7..0
endif
if ((GPR[rs]39..32 - GPR[rt]39..32) > 0x7F) then
    GPR[rd]39..32 ← 0x7F
else if (0x100 <= (GPR[rs]39..32 - GPR[rt]39..32) < 0x180) then
    GPR[rd]39..32 ← 0x80
else
    GPR[rd]39..32 ← (GPR[rs]39..32 - GPR[rt]39..32)7..0
endif
if ((GPR[rs]47..40 - GPR[rt]47..40) > 0x7F) then
    GPR[rd]47..40 ← 0x7F
else if (0x100 <= (GPR[rs]47..40 - GPR[rt]47..40) < 0x180) then
    GPR[rd]47..40 ← 0x80
else
    GPR[rd]47..40 ← (GPR[rs]47..40 - GPR[rt]47..40)7..0
endif
if ((GPR[rs]55..48 - GPR[rt]55..48) > 0x7F) then
    GPR[rd]55..48 ← 0x7F
else if (0x100 <= (GPR[rs]55..48 - GPR[rt]55..48) < 0x180) then
    GPR[rd]55..48 ← 0x80
else
    GPR[rd]55..48 ← (GPR[rs]55..48 - GPR[rt]55..48)7..0
endif
if ((GPR[rs]63..56 - GPR[rt]63..56) > 0x7F) then
    GPR[rd]63..56 ← 0x7F
else if (0x100 <= (GPR[rs]63..56 - GPR[rt]63..56) < 0x180) then
    GPR[rd]63..56 ← 0x80
else
    GPR[rd]63..56 ← (GPR[rs]63..56 - GPR[rt]63..56)7..0
endif
if ((GPR[rs]71..64 - GPR[rt]71..64) > 0x7F) then
    GPR[rd]71..64 ← 0x7F
else if (0x100 <= (GPR[rs]71..64 - GPR[rt]71..64) < 0x180) then
    GPR[rd]71..64 ← 0x80
else
    GPR[rd]71..64 ← (GPR[rs]71..64 - GPR[rt]71..64)7..0
endif
if ((GPR[rs]79..72 - GPR[rt]79..72) > 0x7F) then
    GPR[rd]79..72 ← 0x7F
else if (0x100 <= (GPR[rs]79..72 - GPR[rt]79..72) < 0x180) then
    GPR[rd]79..72 ← 0x80
else
    GPR[rd]79..72 ← (GPR[rs]79..72 - GPR[rt]79..72)7..0
endif
if ((GPR[rs]87..80 - GPR[rt]87..80) > 0x7F) then
    GPR[rd]87..80 ← 0x7F
else if (0x100 <= (GPR[rs]87..80 - GPR[rt]87..80) < 0x180) then
    GPR[rd]87..80 ← 0x80
else
    GPR[rd]87..80 ← (GPR[rs]87..80 - GPR[rt]87..80)7..0
endif
if ((GPR[rs]95..88 - GPR[rt]95..88) > 0x7F) then
    GPR[rd]95..88 ← 0x7F
else if (0x100 <= (GPR[rs]95..88 - GPR[rt]95..88) < 0x180) then
    GPR[rd]95..88 ← 0x80
else
    GPR[rd]95..88 ← (GPR[rs]95..88 - GPR[rt]95..88)7..0
endif
if ((GPR[rs]103..96 - GPR[rt]103..96) > 0x7F) then
    GPR[rd]103..96 ← 0x7F
else if (0x100 <= (GPR[rs]103..96 - GPR[rt]103..96) < 0x180) then
    GPR[rd]103..96 ← 0x80
else
    GPR[rd]103..96 ← (GPR[rs]103..96 - GPR[rt]103..96)7..0
endif
if ((GPR[rs]111..104 - GPR[rt]111..104) > 0x7F) then
    GPR[rd]111..104 ← 0x7F
else if (0x100 <= (GPR[rs]111..104 - GPR[rt]111..104) < 0x180) then
    GPR[rd]111..104 ← 0x80
else
    GPR[rd]111..104 ← (GPR[rs]111..104 - GPR[rt]111..104)7..0
endif
if ((GPR[rs]119..112 - GPR[rt]119..112) > 0x7F) then
    GPR[rd]119..112 ← 0x7F
else if (0x100 <= (GPR[rs]119..112 - GPR[rt]119..112) < 0x180) then
    GPR[rd]119..112 ← 0x80
else
    GPR[rd]119..112 ← (GPR[rs]119..112 - GPR[rt]119..112)7..0
endif
if ((GPR[rs]127..120 - GPR[rt]127..120) > 0x7F) then
    GPR[rd]127..120 ← 0x7F
else if (0x100 <= (GPR[rs]127..120 - GPR[rt]127..120) < 0x180) then
    GPR[rd]127..120 ← 0x80
else
    GPR[rd]127..120 ← (GPR[rs]127..120 - GPR[rt]127..120)7..0
endif
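The sixteen blocks above all follow one pattern, which the following sketch expresses as a loop. It is a behavioural illustration based on the prose description (clamp to the signed byte range), not a transcription of the pseudocode comparisons; the names `psubsb` and `to_signed8` are hypothetical.

```python
def to_signed8(value):
    """Interpret the low 8 bits of value as a two's-complement byte."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def psubsb(rs, rt):
    """Model of PSUBSB: sixteen signed byte subtractions with saturation."""
    result = 0
    for lane in range(16):
        base = 8 * lane
        diff = to_signed8(rs >> base) - to_signed8(rt >> base)
        diff = max(-0x80, min(0x7F, diff))     # overflow -> 0x7F, underflow -> 0x80
        result |= (diff & 0xFF) << base
    return result
```

For instance, 0x7F minus 0xFF (that is, 127 minus -1) saturates to 0x7F, and 0x80 minus 0x01 (-128 minus 1) saturates to 0x80.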


[Figure: rs holds bytes A15..A0 and rt holds bytes B15..B0 in bits 127..120 down to 7..0; byte i of rd receives Ai - Bi, saturated to a signed byte.]


Exceptions: 


None
PSUBSH Parallel Subtract with Signed Saturation Halfword PSUBSH


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSUBSH rd, rs, rt
Purpose: To subtract 8 pairs of 16-bit signed integers with saturation in parallel.
Description: rd ← rs - rt


The eight signed halfwords in GPR rt are subtracted from the corresponding eight signed
halfwords in GPR rs in parallel. The results are placed into the corresponding eight
halfwords in GPR rd.


No overflow or underflow exceptions are generated under any circumstances. Results 
beyond the range of a signed halfword value are saturated according to the following: 


Overflow: 0x7FFF
Underflow: 0x8000 


This instruction operates on 128-bit registers. 


Operation:

if ((GPR[rs]15..0 - GPR[rt]15..0) > 0x7FFF) then
    GPR[rd]15..0 ← 0x7FFF
else if (0x10000 <= (GPR[rs]15..0 - GPR[rt]15..0) < 0x18000) then
    GPR[rd]15..0 ← 0x8000
else
    GPR[rd]15..0 ← (GPR[rs]15..0 - GPR[rt]15..0)15..0
endif
if ((GPR[rs]31..16 - GPR[rt]31..16) > 0x7FFF) then
    GPR[rd]31..16 ← 0x7FFF
else if (0x10000 <= (GPR[rs]31..16 - GPR[rt]31..16) < 0x18000) then
    GPR[rd]31..16 ← 0x8000
else
    GPR[rd]31..16 ← (GPR[rs]31..16 - GPR[rt]31..16)15..0
endif
if ((GPR[rs]47..32 - GPR[rt]47..32) > 0x7FFF) then
    GPR[rd]47..32 ← 0x7FFF
else if (0x10000 <= (GPR[rs]47..32 - GPR[rt]47..32) < 0x18000) then
    GPR[rd]47..32 ← 0x8000
else
    GPR[rd]47..32 ← (GPR[rs]47..32 - GPR[rt]47..32)15..0
endif
if ((GPR[rs]63..48 - GPR[rt]63..48) > 0x7FFF) then
    GPR[rd]63..48 ← 0x7FFF
else if (0x10000 <= (GPR[rs]63..48 - GPR[rt]63..48) < 0x18000) then
    GPR[rd]63..48 ← 0x8000
else
    GPR[rd]63..48 ← (GPR[rs]63..48 - GPR[rt]63..48)15..0
endif
if ((GPR[rs]79..64 - GPR[rt]79..64) > 0x7FFF) then
    GPR[rd]79..64 ← 0x7FFF
else if (0x10000 <= (GPR[rs]79..64 - GPR[rt]79..64) < 0x18000) then
    GPR[rd]79..64 ← 0x8000
else
    GPR[rd]79..64 ← (GPR[rs]79..64 - GPR[rt]79..64)15..0
endif
if ((GPR[rs]95..80 - GPR[rt]95..80) > 0x7FFF) then
    GPR[rd]95..80 ← 0x7FFF
else if (0x10000 <= (GPR[rs]95..80 - GPR[rt]95..80) < 0x18000) then
    GPR[rd]95..80 ← 0x8000
else
    GPR[rd]95..80 ← (GPR[rs]95..80 - GPR[rt]95..80)15..0
endif
if ((GPR[rs]111..96 - GPR[rt]111..96) > 0x7FFF) then
    GPR[rd]111..96 ← 0x7FFF
else if (0x10000 <= (GPR[rs]111..96 - GPR[rt]111..96) < 0x18000) then
    GPR[rd]111..96 ← 0x8000
else
    GPR[rd]111..96 ← (GPR[rs]111..96 - GPR[rt]111..96)15..0
endif
if ((GPR[rs]127..112 - GPR[rt]127..112) > 0x7FFF) then
    GPR[rd]127..112 ← 0x7FFF
else if (0x10000 <= (GPR[rs]127..112 - GPR[rt]127..112) < 0x18000) then
    GPR[rd]127..112 ← 0x8000
else
    GPR[rd]127..112 ← (GPR[rs]127..112 - GPR[rt]127..112)15..0
endif
[Figure: rs holds halfwords A7..A0 and rt holds halfwords B7..B0 in bits 127..112 down to 15..0; halfword i of rd receives Ai - Bi, saturated to a signed halfword.]


Exceptions: 


None
PSUBSW Parallel Subtract with Signed Saturation Word PSUBSW


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSUBSW rd, rs, rt
Purpose: To subtract 4 pairs of 32-bit signed integers with saturation in parallel.
Description: rd ← rs - rt


The four signed words in GPR rt are subtracted from the corresponding four signed words
in GPR rs in parallel. The results are placed into the corresponding four words in GPR rd.


No overflow or underflow exceptions are generated under any circumstances. Results 
beyond the range of a signed word value are saturated according to the following: 


Overflow: 0x7FFFFFFF
Underflow: 0x80000000 


This instruction operates on 128-bit registers. 


Operation:

if ((GPR[rs]31..0 - GPR[rt]31..0) > 0x7FFFFFFF) then
    GPR[rd]31..0 ← 0x7FFFFFFF
else if (0x100000000 <= (GPR[rs]31..0 - GPR[rt]31..0) < 0x180000000) then
    GPR[rd]31..0 ← 0x80000000
else
    GPR[rd]31..0 ← (GPR[rs]31..0 - GPR[rt]31..0)31..0
endif
if ((GPR[rs]63..32 - GPR[rt]63..32) > 0x7FFFFFFF) then
    GPR[rd]63..32 ← 0x7FFFFFFF
else if (0x100000000 <= (GPR[rs]63..32 - GPR[rt]63..32) < 0x180000000) then
    GPR[rd]63..32 ← 0x80000000
else
    GPR[rd]63..32 ← (GPR[rs]63..32 - GPR[rt]63..32)31..0
endif
if ((GPR[rs]95..64 - GPR[rt]95..64) > 0x7FFFFFFF) then
    GPR[rd]95..64 ← 0x7FFFFFFF
else if (0x100000000 <= (GPR[rs]95..64 - GPR[rt]95..64) < 0x180000000) then
    GPR[rd]95..64 ← 0x80000000
else
    GPR[rd]95..64 ← (GPR[rs]95..64 - GPR[rt]95..64)31..0
endif
if ((GPR[rs]127..96 - GPR[rt]127..96) > 0x7FFFFFFF) then
    GPR[rd]127..96 ← 0x7FFFFFFF
else if (0x100000000 <= (GPR[rs]127..96 - GPR[rt]127..96) < 0x180000000) then
    GPR[rd]127..96 ← 0x80000000
else
    GPR[rd]127..96 ← (GPR[rs]127..96 - GPR[rt]127..96)31..0
endif
[Figure: rs holds words A3..A0 and rt holds words B3..B0 in bits 127..96 down to 31..0; word i of rd receives Ai - Bi, saturated to a signed word.]
Exceptions: 


None
PSUBUB Parallel Subtract with Unsigned Saturation Byte PSUBUB


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSUBUB rd, rs, rt
Purpose: To subtract 16 pairs of 8-bit unsigned integers with saturation in parallel.
Description: rd ← rs - rt


The sixteen unsigned bytes in GPR rt are subtracted from the corresponding sixteen
unsigned bytes in GPR rs in parallel. The results are placed into the corresponding sixteen
bytes in GPR rd.


No underflow exceptions are generated under any circumstances. Results beyond the 
range of an unsigned byte value are saturated according to the following: 


Underflow: 0x00 


This instruction operates on 128-bit registers. 


Operation:


if ((GPR[rs]7..0 - GPR[rt]7..0) < 0x00) then
    GPR[rd]7..0 ← 0x00
else
    GPR[rd]7..0 ← (GPR[rs]7..0 - GPR[rt]7..0)7..0
endif
if ((GPR[rs]15..8 - GPR[rt]15..8) < 0x00) then
    GPR[rd]15..8 ← 0x00
else
    GPR[rd]15..8 ← (GPR[rs]15..8 - GPR[rt]15..8)7..0
endif
if ((GPR[rs]23..16 - GPR[rt]23..16) < 0x00) then
    GPR[rd]23..16 ← 0x00
else
    GPR[rd]23..16 ← (GPR[rs]23..16 - GPR[rt]23..16)7..0
endif
if ((GPR[rs]31..24 - GPR[rt]31..24) < 0x00) then
    GPR[rd]31..24 ← 0x00
else
    GPR[rd]31..24 ← (GPR[rs]31..24 - GPR[rt]31..24)7..0
endif
if ((GPR[rs]39..32 - GPR[rt]39..32) < 0x00) then
    GPR[rd]39..32 ← 0x00
else
    GPR[rd]39..32 ← (GPR[rs]39..32 - GPR[rt]39..32)7..0
endif
if ((GPR[rs]47..40 - GPR[rt]47..40) < 0x00) then
    GPR[rd]47..40 ← 0x00
else
    GPR[rd]47..40 ← (GPR[rs]47..40 - GPR[rt]47..40)7..0
endif
if ((GPR[rs]55..48 - GPR[rt]55..48) < 0x00) then
    GPR[rd]55..48 ← 0x00
else
    GPR[rd]55..48 ← (GPR[rs]55..48 - GPR[rt]55..48)7..0
endif
if ((GPR[rs]63..56 - GPR[rt]63..56) < 0x00) then
    GPR[rd]63..56 ← 0x00
else
    GPR[rd]63..56 ← (GPR[rs]63..56 - GPR[rt]63..56)7..0
endif
if ((GPR[rs]71..64 - GPR[rt]71..64) < 0x00) then
    GPR[rd]71..64 ← 0x00
else
    GPR[rd]71..64 ← (GPR[rs]71..64 - GPR[rt]71..64)7..0
endif
if ((GPR[rs]79..72 - GPR[rt]79..72) < 0x00) then
    GPR[rd]79..72 ← 0x00
else
    GPR[rd]79..72 ← (GPR[rs]79..72 - GPR[rt]79..72)7..0
endif
if ((GPR[rs]87..80 - GPR[rt]87..80) < 0x00) then
    GPR[rd]87..80 ← 0x00
else
    GPR[rd]87..80 ← (GPR[rs]87..80 - GPR[rt]87..80)7..0
endif
if ((GPR[rs]95..88 - GPR[rt]95..88) < 0x00) then
    GPR[rd]95..88 ← 0x00
else
    GPR[rd]95..88 ← (GPR[rs]95..88 - GPR[rt]95..88)7..0
endif
if ((GPR[rs]103..96 - GPR[rt]103..96) < 0x00) then
    GPR[rd]103..96 ← 0x00
else
    GPR[rd]103..96 ← (GPR[rs]103..96 - GPR[rt]103..96)7..0
endif
if ((GPR[rs]111..104 - GPR[rt]111..104) < 0x00) then
    GPR[rd]111..104 ← 0x00
else
    GPR[rd]111..104 ← (GPR[rs]111..104 - GPR[rt]111..104)7..0
endif
if ((GPR[rs]119..112 - GPR[rt]119..112) < 0x00) then
    GPR[rd]119..112 ← 0x00
else
    GPR[rd]119..112 ← (GPR[rs]119..112 - GPR[rt]119..112)7..0
endif
if ((GPR[rs]127..120 - GPR[rt]127..120) < 0x00) then
    GPR[rd]127..120 ← 0x00
else
    GPR[rd]127..120 ← (GPR[rs]127..120 - GPR[rt]127..120)7..0
endif
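Unsigned saturating subtraction clamps any negative difference to zero. The sketch below models it with a lane width parameter, so the same loop can also illustrate the halfword and word forms of the instruction; the name `saturating_sub_unsigned` and its `lane_bits` parameter are hypothetical, and 128-bit registers are assumed to be Python integers.

```python
def saturating_sub_unsigned(rs, rt, lane_bits=8):
    """Model of unsigned saturating subtraction over a 128-bit register.

    lane_bits = 8 corresponds to PSUBUB; 16 and 32 follow the same pattern.
    """
    mask = (1 << lane_bits) - 1
    result = 0
    for base in range(0, 128, lane_bits):
        a = (rs >> base) & mask
        b = (rt >> base) & mask
        diff = a - b
        if diff < 0:                           # underflow saturates to zero
            diff = 0
        result |= diff << base
    return result
```

With byte lanes, rs = 0x1005 and rt = 0x0110 give 0x0F00: the low byte underflows (0x05 - 0x10) and is clamped to 0x00, while the next byte is an ordinary difference.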


[Figure: rs holds bytes A15..A0 and rt holds bytes B15..B0 in bits 127..120 down to 7..0; byte i of rd receives Ai - Bi, saturated to an unsigned byte.]


Exceptions: 


None
PSUBUH Parallel Subtract with Unsigned Saturation Halfword PSUBUH


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSUBUH rd, rs, rt
Purpose: To subtract 8 pairs of 16-bit unsigned integers with saturation in parallel.
Description: rd ← rs - rt


The eight unsigned halfwords in GPR rt are subtracted from the corresponding eight
unsigned halfwords in GPR rs in parallel. The results are placed into the corresponding
eight halfwords in GPR rd.


No underflow exceptions are generated under any circumstances. Results beyond the 
range of an unsigned halfword value are saturated according to the following: 


Underflow: 0x0000 


This instruction operates on 128-bit registers. 


Operation:
if ((GPR[rs]15..0 - GPR[rt]15..0) < 0x0000) then
    GPR[rd]15..0 ← 0x0000
else
    GPR[rd]15..0 ← (GPR[rs]15..0 - GPR[rt]15..0)15..0
endif
if ((GPR[rs]31..16 - GPR[rt]31..16) < 0x0000) then
    GPR[rd]31..16 ← 0x0000
else
    GPR[rd]31..16 ← (GPR[rs]31..16 - GPR[rt]31..16)15..0
endif
if ((GPR[rs]47..32 - GPR[rt]47..32) < 0x0000) then
    GPR[rd]47..32 ← 0x0000
else
    GPR[rd]47..32 ← (GPR[rs]47..32 - GPR[rt]47..32)15..0
endif
if ((GPR[rs]63..48 - GPR[rt]63..48) < 0x0000) then
    GPR[rd]63..48 ← 0x0000
else
    GPR[rd]63..48 ← (GPR[rs]63..48 - GPR[rt]63..48)15..0
endif
if ((GPR[rs]79..64 - GPR[rt]79..64) < 0x0000) then
    GPR[rd]79..64 ← 0x0000
else
    GPR[rd]79..64 ← (GPR[rs]79..64 - GPR[rt]79..64)15..0
endif
if ((GPR[rs]95..80 - GPR[rt]95..80) < 0x0000) then
    GPR[rd]95..80 ← 0x0000
else
    GPR[rd]95..80 ← (GPR[rs]95..80 - GPR[rt]95..80)15..0
endif
if ((GPR[rs]111..96 - GPR[rt]111..96) < 0x0000) then
    GPR[rd]111..96 ← 0x0000
else
    GPR[rd]111..96 ← (GPR[rs]111..96 - GPR[rt]111..96)15..0
endif
if ((GPR[rs]127..112 - GPR[rt]127..112) < 0x0000) then
    GPR[rd]127..112 ← 0x0000
else
    GPR[rd]127..112 ← (GPR[rs]127..112 - GPR[rt]127..112)15..0
endif
[Figure: rs holds halfwords A7..A0 and rt holds halfwords B7..B0 in bits 127..112 down to 15..0; halfword i of rd receives Ai - Bi, saturated to an unsigned halfword.]


Exceptions: 


None
PSUBUW Parallel Subtract with Unsigned Saturation Word PSUBUW


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: PSUBUW rd, rs, rt
Purpose: To subtract 4 pairs of 32-bit unsigned integers with saturation in parallel.
Description: rd ← rs - rt


The four unsigned words in GPR rt are subtracted from the corresponding four unsigned
words in GPR rs in parallel. The results are placed into the corresponding four words in
GPR rd.


No underflow exceptions are generated under any circumstances. Results beyond the 
range of an unsigned word value are saturated according to the following: 


Underflow: 0x00000000 


This instruction operates on 128-bit registers. 


Operation:
if ((GPR[rs]31..0 - GPR[rt]31..0) < 0x00000000) then
    GPR[rd]31..0 ← 0x00000000
else
    GPR[rd]31..0 ← (GPR[rs]31..0 - GPR[rt]31..0)31..0
endif
if ((GPR[rs]63..32 - GPR[rt]63..32) < 0x00000000) then
    GPR[rd]63..32 ← 0x00000000
else
    GPR[rd]63..32 ← (GPR[rs]63..32 - GPR[rt]63..32)31..0
endif
if ((GPR[rs]95..64 - GPR[rt]95..64) < 0x00000000) then
    GPR[rd]95..64 ← 0x00000000
else
    GPR[rd]95..64 ← (GPR[rs]95..64 - GPR[rt]95..64)31..0
endif
if ((GPR[rs]127..96 - GPR[rt]127..96) < 0x00000000) then
    GPR[rd]127..96 ← 0x00000000
else
    GPR[rd]127..96 ← (GPR[rs]127..96 - GPR[rt]127..96)31..0
endif


[Figure: rs holds words A3..A0 and rt holds words B3..B0 in bits 127..96 down to 31..0; word i of rd receives Ai - Bi, saturated to an unsigned word.]


Exceptions: 


None
PSUBW Parallel Subtract Word PSUBW


[Encoding diagram: MMI (bits 31..26), bit fields 25..21, 20..16, 15..11, PSUBW (bits 10..6), MMI0 (bits 5..0)]
C790
Format: PSUBW rd, rs, rt
Purpose: To subtract 4 pairs of 32-bit integers in parallel.
Description: rd ← rs - rt


The four signed words in GPR rt are subtracted from the corresponding four words in GPR
rs in parallel. The results are placed into the corresponding four words in GPR rd.


No overflow or underflow exceptions are generated under any circumstances. 


This instruction operates on 128-bit registers. 


Operation:


GPR[rd]31..0 ← (GPR[rs]31..0 - GPR[rt]31..0)31..0
GPR[rd]63..32 ← (GPR[rs]63..32 - GPR[rt]63..32)31..0
GPR[rd]95..64 ← (GPR[rs]95..64 - GPR[rt]95..64)31..0
GPR[rd]127..96 ← (GPR[rs]127..96 - GPR[rt]127..96)31..0


[Figure: rs holds words A3..A0 and rt holds words B3..B0 in bits 127..96 down to 31..0; word i of rd receives Ai - Bi.]


Exceptions: 


None
PXOR Parallel Exclusive OR PXOR


31    26 25   21 20   16 15   11 10    6 5      0
  MMI      rs      rt      rd     PXOR    MMI2
   6       5       5       5       5       6
C790
Format: PXOR rd, rs, rt
Purpose: To do a bitwise logical EXCLUSIVE OR.
Description: rd ← rs XOR rt


The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical
exclusive OR operation. The result is placed into GPR rd.


This instruction operates on 128-bit registers. 


Operation: 
GPR[rd]127..0 ← GPR[rs]127..0 xor GPR[rt]127..0


[Figure: rs holds doublewords A1, A0 and rt holds doublewords B1, B0; rd receives A1 XOR B1 in bits 127..64 and A0 XOR B0 in bits 63..0.]


Exceptions: 


None
QFSRV Quadword Funnel Shift Right Variable QFSRV


[Encoding diagram: bit fields 31..26, 25..21, 20..16, 15..11, 10..6, 5..0]
C790
Format: QFSRV rd, rs, rt
Purpose: To right shift a quadword by a variable number of bits.
Description: rd ← (rs, rt) >> SA


The content of GPR rt is concatenated with the content of GPR rs, producing the
intermediate result rs:rt. This value is shifted right by the number of bits specified in the
shift amount register SA. The least significant 16 bytes (i.e. quadword) of the shifted
result are placed into GPR rd.


Restriction: 


Note that SA can be loaded only with byte shift values (MTSAB) or halfword shift values 
(MTSAH); i.e. with bit shift amounts that are multiples of 8 or 16. 


This instruction operates on 128-bit registers. 


Operation:
if (SA = 0) then
    GPR[rd]127..0 ← GPR[rt]127..0
else
    GPR[rd]127..0 ← GPR[rs](SA-1)..0 || GPR[rt]127..SA
endif
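The funnel shift, together with the left-shift and rotate idioms given in the programming notes for this instruction, can be sketched as follows. This is an illustrative model under stated assumptions: 128-bit registers are Python integers, and the helpers `mtsab` and `mtsah` assume that those instructions set SA to the XOR of the low register bits and the immediate, scaled to bits by 8 or 16, as described in their own entries in the manual. All function names here are hypothetical.

```python
MASK128 = (1 << 128) - 1


def qfsrv(rs, rt, sa_bits):
    """rd <- low 128 bits of ((rs:rt) >> SA), where sa_bits is the bit count in SA."""
    wide = (rs << 128) | rt                    # rs is the upper quadword, rt the lower
    return (wide >> sa_bits) & MASK128


def mtsab(gpr_value, imm):
    """Assumed MTSAB behaviour: SA = (low 4 bits of gpr XOR imm) * 8 bits."""
    return ((gpr_value ^ imm) & 0xF) * 8


def mtsah(gpr_value, imm):
    """Assumed MTSAH behaviour: SA = (low 3 bits of gpr XOR imm) * 16 bits."""
    return ((gpr_value ^ imm) & 0x7) * 16


def funnel_shift_left_bytes(src1, src2, s):
    """Left funnel shift by s bytes (1..15) using the subtract-then-XOR idiom."""
    sa = mtsab(s - 1, -1)                      # (s - 1) XOR 0xF equals 16 - s
    return qfsrv(src1, src2, sa)
```

Passing the same value as both sources turns the funnel shift into a rotate, for example `qfsrv(r, r, mtsah(0, 3))` rotates `r` right by 48 bits.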


Programming Note:


1. A left funnel shift by an amount of s bytes can be done by setting SA to 16-s using
the MTSAB instruction, provided that s is not 0. Similarly, a left funnel shift by s
halfwords can be done by setting SA to 8-s using the MTSAH instruction, provided
that s is not 0. A quick way to perform this computation is as follows:

    // Register %sal contains the left shift amount
    subi %samt, %sal, 1
    mtsab %samt, -1


    // Following QFSRV does a shift left by %sal bytes
    qfsrv %dst, %src1, %src2


2. QFSRV can be used to rotate a 128-bit quantity r by setting both source operands
rs and rt to register r. For example, the following code sequence rotates right the
value in wide register %5 by 3 halfwords (i.e. 48 bits), and deposits the result in
wide register %6.

    mtsah %0, 3
    qfsrv %6, %5, %5
SQ Store Quadword SQ


31    26 25   21 20   16 15              0
   SQ      base     rt        offset
 011111
   6        5       5           16
C790

Format: SQ rt, offset (base)

Purpose: To store a quadword to memory.

Description: memory [base + offset] ← rt


The 128-bit quadword in GPR rt is stored in memory at the location specified by the
effective address. The 16-bit signed offset is added to the contents of GPR base to form the
effective address. The least significant four bits of the effective address are masked to zero
(effectively creating an aligned address) before being used to access memory. No address
exceptions due to alignment are possible.


Restrictions: 
The effective address doesn’t have to be naturally aligned. The least significant 4 bits of 
the effective address are ignored. 


Operation:


vAddr ← sign_extend (offset) + GPR[base]31..0
vAddr3..0 ← 0^4
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
quadword ← GPR[rt]127..0
StoreMemory (uncached, QUADWORD, quadword, pAddr, vAddr, DATA)
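The first two steps of the operation, forming the effective address and forcing quadword alignment, can be sketched as follows. This is a minimal illustration assuming a 32-bit virtual address; the function name `sq_effective_address` is hypothetical, and address translation and the store itself are not modelled.

```python
def sq_effective_address(base, offset16):
    """Compute the aligned virtual address used by SQ."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                        # sign-extend the 16-bit offset
        offset -= 0x10000
    vaddr = (base + offset) & 0xFFFFFFFF       # 32-bit effective address
    return vaddr & ~0xF                        # low four bits are masked to zero
```

Because the low four bits are simply discarded, an unaligned address such as 0x1017 stores to the quadword at 0x1010 rather than raising an alignment exception.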


Exceptions: 


TLB Refill 
TLB Invalid 
Address Error 


Programming Notes: 


None
B.5 C790-Specific Instruction Encoding


31    26                         0
 OpCode

Instructions encoded by OpCode field (MMI, LQ, SQ)

                 bits 28..26
bits 31..29      0 (000)  1 (001)  2 (010)  3 (011)  4 (100)  5 (101)      6 (110)  7 (111)
3 (011)          DADDI    DADDIU   LDL      LDR      MMI      (illegible)  LQ       SQ
4 (100)          LB       LH       LWL      LW       LBU      LHU          LWR      LWU

[Rows 0 to 2 and 5 to 7 of this table are illegible in the source.]


Instructions encoded by function field when OpCode field = MMI

[Table indexed by bits 5..3 (rows) and bits 2..0 (columns); entries illegible in the source.]


31    26          10    6 5      0
OpCode = MMI     function   MMI0

Instructions encoded by function field when OpCode field = MMI & bits 5..0 = MMI0

[Table entries illegible in the source.]


31    26          10    6 5      0
OpCode = MMI     function   MMI1

Instructions encoded by function field when OpCode field = MMI & bits 5..0 = MMI1

[Table entries illegible in the source.]


31    26          10    6 5      0
OpCode = MMI     function   MMI2

Instructions encoded by function field when OpCode field = MMI & bits 5..0 = MMI2

[Table with columns indexed by bits 7..6; entries illegible in the source.]


31    26          10    6 5      0
OpCode = MMI     function   MMI3

Instructions encoded by function field when OpCode field = MMI & bits 5..0 = MMI3

[Table with columns indexed by bits 7..6; entries illegible in the source.]


This OpCode is reserved for future use. An attempt to execute it causes a
Reserved Instruction exception.


This OpCode indicates an instruction class. The instruction word must be
further decoded by examining additional tables that show the values for
other instruction fields.

This OpCode is reserved for one of the following instructions, which are
currently not supported: DMULT, DMULTU, DDIV, DDIVU, LL, LLD, SC,
SCD, LWC2, SWC2. An attempt to execute it causes a Reserved Instruction
exception.
C. COP0 System Control
Coprocessor Instruction Set Details


This appendix provides a detailed description of the operation of each System Control
Coprocessor (COP0) instruction.


COP0 instructions perform operations specifically on the System Control Coprocessor
registers to manipulate the memory management and exception handling facilities of the
processor.


COP0 instructions are enabled if the processor is in Kernel mode, or if bit 28
(CU[0]) is set in the Status register. Otherwise, executing one of these instructions
generates a Coprocessor Unusable exception. The only exceptions to this rule are the EI
and DI instructions, which never generate Coprocessor Unusable exceptions.


When the EDI bit in the Status register is set, the EI and DI instructions operate in User,
Supervisor, and Kernel modes, regardless of whether the COP0 coprocessor usable bit
(Status.CU[0]) is set. When the EDI bit is cleared, EI and DI work as NOPs in User
and Supervisor modes, regardless of whether the COP0 coprocessor usable bit
(Status.CU[0]) is set, and execute normally in Kernel mode.
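The enabling rules above can be summarised as two small predicates. The sketch below is only an illustration of the text; the function names and the string encoding of the operating mode are hypothetical.

```python
MODES = ("user", "supervisor", "kernel")


def cop0_usable(mode, cu0):
    """True if an ordinary COP0 instruction may execute without a
    Coprocessor Unusable exception (does not apply to EI and DI)."""
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    return mode == "kernel" or bool(cu0)


def ei_di_takes_effect(mode, edi):
    """True if EI or DI actually changes the interrupt enable state.
    When this returns False the instruction behaves as a NOP; in neither
    case is a Coprocessor Unusable exception raised. Status.CU[0] plays no part."""
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    return mode == "kernel" or bool(edi)
```

Note that `ei_di_takes_effect` deliberately takes no `cu0` argument, reflecting that the behaviour of EI and DI is independent of Status.CU[0].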
BC0F Branch on Coprocessor 0 False BC0F


31    26 25   21 20   16 15              0
  COP0     BC0     BC0F       offset
 010000   01000   00000
   6        5       5           16
MIPS I
Format: BC0F offset
Description:


A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor 0’s
condition signal, as sampled during the previous instruction, is false, then the program
branches to the target address with a delay of one instruction.


Restrictions: 


Because the coprocessor 0 condition is externally supplied, there is no way to synchronize 
the change/update of the condition and the execution of this instruction. 


Operation:
I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← not CPCOND0
I+1: if condition then
         PC ← PC + tgt_offset
     endif
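The target computation can be illustrated with a short model. It assumes 4-byte instructions and a 32-bit program counter, takes the address of the branch instruction itself as input, and uses hypothetical function names; coprocessor usability checks are not modelled.

```python
def branch_target(branch_pc, offset16):
    """Target address: delay-slot address plus the sign-extended offset times 4."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                        # sign-extend the 16-bit offset
        offset -= 0x10000
    delay_slot_pc = branch_pc + 4              # instruction following the branch
    return (delay_slot_pc + (offset << 2)) & 0xFFFFFFFF


def bc0f_next_pc(branch_pc, offset16, cpcond0):
    """Address executed after the delay slot of a BC0F at branch_pc."""
    if not cpcond0:                            # branch taken when the condition is false
        return branch_target(branch_pc, offset16)
    return (branch_pc + 8) & 0xFFFFFFFF        # fall through past the delay slot
```

An offset of 0xFFFF (that is, -1) therefore branches back to the branch instruction itself, since the delay-slot address minus 4 is the address of the branch.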
Exceptions: 


Coprocessor Unusable exception
BC0FL Branch on Coprocessor 0 False Likely BC0FL


31    26 25   21 20   16 15              0
  COP0     BC0    BC0FL       offset
 010000   01000   00010
   6        5       5           16
MIPS II
Format: BC0FL offset
Description:


A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor 0’s
condition signal, as sampled during the previous instruction, is false, the program
branches to the target address with a delay of one instruction.

If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Restrictions: 


Because the coprocessor 0 condition is externally supplied, there is no way to synchronize 
the change/update of the condition and the execution of this instruction. 


Operation:
I:   tgt_offset ← sign_extend(offset || 0²)
     condition ← not CPCOND0
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif


Exceptions: 


Coprocessor Unusable exception 
BC0T Branch on Coprocessor 0 True BC0T


31    26 25    21 20    16 15                      0
COP0     BC0      BC0T     offset
010000   01000    00001
6        5        5        16
MIPS I
Format: BC0T offset
Description: 


A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor
0’s condition signal is true, then the program branches to the target address with a delay
of one instruction.


Restrictions: 


Because the coprocessor 0 condition is externally supplied, there is no way to synchronize 
the change/update of the condition and the execution of this instruction. 


Operation:
I:   tgt_offset ← sign_extend(offset || 0²)
     condition ← CPCOND0
I+1: if condition then
         PC ← PC + tgt_offset
     endif
Exceptions: 


Coprocessor Unusable exception 
BC0TL Branch on Coprocessor 0 True Likely BC0TL


31    26 25    21 20    16 15                      0
COP0     BC0      BC0TL    offset
010000   01000    00011
6        5        5        16
MIPS II
Format: BC0TL offset


Description: 


A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor
0’s condition signal, as sampled during the previous instruction, is true, the
program branches to the target address with a delay of one instruction.


If the conditional branch is not taken, the instruction in the branch delay slot is nullified. 


Restrictions: 


Because the coprocessor 0 condition is externally supplied, there is no way to synchronize 
the change/update of the condition and the execution of this instruction. 


Operation:
I:   tgt_offset ← sign_extend(offset || 0²)
     condition ← CPCOND0
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif


Exceptions: 


Coprocessor Unusable exception 
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CACHE Cache CACHE 
31    26 25    21 20    16 15                      0
CACHE    base     op       offset
101111            (See table)
6        5        5        16
R4000
Format: CACHE op, offset(base)
Description: 


The 16-bit offset is sign-extended and added to the contents of general register base to
form a virtual address (VA). The VA is translated to a physical address (PA) through the
memory management unit and its TLB, and the 5-bit OpCode (decoded in the table below)
specifies a cache operation for that address, together with the affected cache. Operation of
this instruction on any combination not listed in the table below is undefined. The
operation of this instruction on uncached and uncached accelerated addresses is also
undefined unless it is an index-type sub-operation.


Table C-1. CACHE Instruction Op Field Encoding 


Mnemonic   OpCode   CACHE Instruction   Target
IXLTG      00000    INDEX LOAD TAG      Instruction Cache

Operation:
vAddr ← ((offset₁₅)¹⁶ || offset₁₅..₀) + GPR[base]₃₁..₀
(pAddr, uncached) ← AddressTranslation(vAddr, DATA)
CacheOp(op, vAddr, pAddr)

Exceptions:


Coprocessor Unusable exception
TLB Refill
TLB Invalid
Address Error


C.1.1. Notes on the CACHE Instruction Sub-operations 


Cache Virtual Address
The CACHE instruction uses the following portions of the Virtual Address (VA), computed
by adding the offset to the base, to specify a cache block and way:


• VA[13:6] defines a 64-byte line in the data cache array
• VA[13:6] defines a 64-byte line in the instruction cache array
• In both cases, VA[0] defines the way needed by Index sub-operations


When accessing data in the caches, VA[13:2] is used to read or write a specific data word
in the data cache, and VA[13:2] is used to read or write a specific instruction in the
instruction cache.
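
The field selection described above amounts to simple shifts and masks. The helper names below are hypothetical and serve only to illustrate which VA bits are used.

```c
#include <stdint.h>

/* VA[13:6]: line index used by Index sub-operations. */
static unsigned cache_line_index(uint32_t va) { return (va >> 6) & 0xFFu; }

/* VA[0]: way selected by Index sub-operations. */
static unsigned cache_way(uint32_t va) { return va & 1u; }

/* VA[13:2]: word (or instruction) index used by the Load/Store Data
 * sub-operations. */
static unsigned cache_word_index(uint32_t va) { return (va >> 2) & 0xFFFu; }
```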


Cache Physical Address
The CACHE instruction computes the Physical Address (PA) to access memory for the cache
Hit Invalidate (I) and Fill (I) sub-operations in the following manner:


• VA[31:6] is computed from the CACHE instruction by adding the offset to the
  base, and the result is then translated to produce PA[31:6]


The CACHE instruction computes the Physical Address (PA) to access memory for the cache
Hit Invalidate (D), Hit Writeback Invalidate (D), and Hit Writeback Without Invalidate (D)
sub-operations in the following manner:


• VA[31:6] is computed from the CACHE instruction by adding the offset to the
  base, and the result is then translated to produce PA[31:6]


BTAC Virtual Address


The CACHE instruction uses the following portions of the Virtual Address (VA), computed
by adding the offset to the base, to check whether there is an entry that matches the VA:


• VA[31:3] defines an entry in the BTAC


BTAC Index Bits


Since the BTAC has 64 entries, VA[5:0], computed from the CACHE instruction by
adding the offset to the base, is used to index the BTAC.


COP0 Not Usable


If COP0 is not usable (when not in Kernel mode, Status.CU[0] must be set for COP0 to be
usable), a Coprocessor Unusable exception is taken.
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TLB Exceptions on Cache Operations 
TLB Refill and TLB Invalid exceptions can occur only for the following sub-operations:
1. Hit Invalidate (I)
2. Fill (I)
3. Hit Invalidate (D)
4. Hit Writeback Invalidate (D)
5. Hit Writeback Without Invalidate (D)
The TLB Modified exception is never generated.


Hit Sub-operation Accesses


A Hit sub-operation accesses the specified cache as a normal data reference, and performs 
the specified operation if the cache line contains valid data at the specified physical 
address (a hit). The operation is undefined if a CACHE sub-operation hit occurs in both 
ways of the cache. 


Breakpoint Exception 


Breakpoint exceptions cannot be generated by any of the CACHE sub-operations (note
that an Instruction Address Breakpoint can still be set on the CACHE instruction itself).


Address Error Exception 


None of the CACHE sub-operations will generate an Address Error exception due to 
misalignment of the VA created by the CACHE instruction as described above. The 
following CACHE sub-operations can generate privilege-type Address Error exceptions: 


1. Hit Invalidate (I) 

2. Fill (I) 

3. Hit Invalidate (D) 

4. Hit Writeback Invalidate (D)

5. Hit Writeback without Invalidate (D) 
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C.1.2 Sub-Operation Descriptions 


Note on Cache Enable Status 


All instruction cache related sub-operations perform their function regardless of the value
of the ICE bit of the Config register (i.e., regardless of whether the instruction cache is
enabled or not).


All data cache related sub-operations perform their function regardless of the value of the
DCE bit of the Config register (i.e., regardless of whether the data cache is enabled or
not).


All BTAC-related sub-operations perform their function regardless of the value of the BPE
bit of the Config register.


Op = 00111 Index Invalidate (I) 


Index Invalidate (I) sets a line in the instruction cache to Invalid. VA[13:6] defines the
index of the line and VA[0] defines the way to be invalidated. The LRF bit does not change.


Op = 00000 Index Load Tag (I) 


Index Load Tag (I) reads the instruction cache tag array fields into the COP0 TagLO
register. VA[13:6] defines the index and VA[0] defines the way of the tag to be read. The
following mapping defines the sub-operation:

• TagLO[4] = LRF bit

• TagLO[5] = VALID bit

• TagLO[31:12] = Tag[19:0]


All other TagLO bits are undefined.
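
A minimal sketch of unpacking the value returned by Index Load Tag (I), following the mapping above. The struct and function names are hypothetical and not part of the C790 definition.

```c
#include <stdbool.h>
#include <stdint.h>

struct icache_tag {
    bool     lrf;    /* TagLO[4] */
    bool     valid;  /* TagLO[5] */
    uint32_t tag;    /* TagLO[31:12] = Tag[19:0] */
};

/* Unpack the fields defined for Index Load Tag (I); all other TagLO
 * bits are undefined and therefore ignored. */
static struct icache_tag decode_icache_taglo(uint32_t taglo)
{
    struct icache_tag t;
    t.lrf   = ((taglo >> 4) & 1u) != 0;
    t.valid = ((taglo >> 5) & 1u) != 0;
    t.tag   = taglo >> 12;
    return t;
}
```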


Op = 00100 Index Store Tag (I)


Index Store Tag (I) stores the COP0 TagLO register into the instruction cache tag array.
VA[13:6] defines the index and VA[0] defines the way of the tag to be written. The following
mapping defines the sub-operation:

• LRF bit = TagLO[4]

• VALID bit = TagLO[5]

• Tag[19:0] = TagLO[31:12]


Note that this sub-operation can also be used to invalidate a cache line, by storing a tag
whose VALID bit is 0.


Op = 01011 Hit Invalidate (I)


Hit Invalidate (I) invalidates a line in the instruction cache which matches the PA[31:6]
computed from the CACHE instruction. The tags of both ways at VA[13:6] are read from the
instruction cache.


If the Valid bit of one of the entries is 1 and the PA of the CACHE instruction matches
the Tag from that entry of the instruction cache tag array, the Valid bit of the entry is
changed to 0 (Invalid). The LRF bit does not change. This sub-operation also invalidates
BTAC entries which match VA[31:6].
Op = 01110 Fill (I)


Fill (I) brings in a cache line from memory and stores it in the instruction cache. The
following sequence is followed:

1. The PA computed from the CACHE instruction is used to fetch the cache line from
memory.

2. The line is loaded into the cache line addressed by VA[13:6]; the way of the cache is
selected according to the rules of the LRF bits.

3. The corresponding instruction cache tag is loaded with the PFN and the entry is
validated.


Op = 00001 Index Load Data (I) 


Index Load Data (I) reads a single instruction from the instruction cache data array and
stores it into the COP0 TagLO and TagHI registers. VA[13:2] defines the index and VA[0]
defines the way of the instruction cache to be read. The following mapping defines the
sub-operation:

• TagLO[31:0] = 32-bit instruction

• TagHI[3:0] = SteeringBits[3:0]

• TagHI[5:4] = BHT[1:0]


All other TagHI bits are undefined.


Op = 00101 Index Store Data (I) 


Index Store Data (I) stores the COP0 TagLO and TagHI registers into the instruction
cache data array.


VA[13:2] defines the index and VA[0] defines the way of the instruction cache to be
written. The following mapping defines the sub-operation:

• 32-bit instruction = TagLO[31:0]

• SteeringBits[3:0] = TagHI[3:0]

• BHT[1:0] = TagHI[5:4]


The BHT[1:0] bits are associated with the instruction pair at VA[13:3]. This sub-operation
invalidates all BTAC entries.


Op = 00010 Index Load BTAC (B) 


Index Load BTAC (B) reads a single BTAC entry and stores it into the COP0 TagLO and
TagHI registers. VA[5:0] defines the index of the BTAC entry to be read. The following
mapping defines the sub-operation:

• TagLO[0] = Valid Bit

• TagLO[31:3] = FetchAddress[28:0]

• TagHI[31:2] = TargetAddress[29:0]


All other TagLO and TagHI bits are undefined.
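
The BTAC mapping can be unpacked in the same way. The sketch below assumes that FetchAddress[28:0] holds VA[31:3] of the fetch address and that TargetAddress[29:0] holds bits [31:2] of the branch target; the text states only the bit positions, so both interpretations, along with the struct and function names, are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

struct btac_entry {
    bool     valid;        /* TagLO[0] */
    uint32_t fetch_addr;   /* rebuilt from TagLO[31:3], low 3 bits zero */
    uint32_t target_addr;  /* rebuilt from TagHI[31:2], low 2 bits zero */
};

static struct btac_entry decode_btac(uint32_t taglo, uint32_t taghi)
{
    struct btac_entry e;
    e.valid       = (taglo & 1u) != 0;
    e.fetch_addr  = taglo & ~7u;   /* keep bits 31..3 */
    e.target_addr = taghi & ~3u;   /* keep bits 31..2 */
    return e;
}
```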
Op = 00110 Index Store BTAC (B)


Index Store BTAC (B) stores the COP0 TagLO and TagHI registers into a single BTAC
entry. VA[5:0] defines the index of the BTAC entry to be written. The following mapping
defines the sub-operation:

• Valid Bit = TagLO[0]

• FetchAddress[28:0] = TagLO[31:3]

• TargetAddress[29:0] = TagHI[31:2]


Op = 01100 BTAC Flush (B) 


This sub-operation invalidates the complete BTAC by writing a 0 into the valid bits of all 
the entries of the BTAC. 


Op = 01010 Hit Invalidate BTAC (B) 


Hit Invalidate BTAC (B) invalidates an entry in the BTAC which matches the VA[31:3]
computed from the CACHE instruction. If VA[31:3] matches an entry in the BTAC and
its Valid bit is equal to 1, then the Valid bit is changed to 0. The result is undefined if
more than one entry matches the VA.


Op = 10100 Index Writeback Invalidate (D) 


The Index Writeback Invalidate (D) sub-operation sets a cache line in the data cache to Invalid
and writes back any dirty data to the CPU bus. VA[13:6] defines the index and VA[0]
defines the way of the data cache line to be invalidated. The invalidation takes place by
writing a 0 to the Valid bit. The LRF bit does not change.


The PA to which the cache line is written is calculated by appending VA[11:6] to the
20-bit PFN field from the data cache tag to form PA[31:6]. This address represents a
cache line address.
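
The writeback address formation, PA[31:6] = PFN || VA[11:6], can be sketched as follows. The function name is hypothetical; the low six bits of the result are zero because the address names a whole 64-byte line.

```c
#include <stdint.h>

/* pfn: the 20-bit PFN field read from the data cache tag.
 * va:  the virtual address computed by the CACHE instruction. */
static uint32_t writeback_line_pa(uint32_t pfn, uint32_t va)
{
    uint32_t line_bits = (va >> 6) & 0x3Fu;            /* VA[11:6] */
    return ((pfn & 0xFFFFFu) << 12) | (line_bits << 6);
}
```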


Op = 10000 Index Load Tag (D) 


Index Load Tag (D) reads the data cache tag array fields into the COP0 TagLO register.
VA[13:6] defines the index and VA[0] defines the way of the tag to be read. The following
mapping defines the sub-operation:

• TagLO[3] = Lock bit

• TagLO[4] = LRF bit

• TagLO[5] = Valid bit

• TagLO[6] = Dirty bit

• TagLO[31:12] = Tag[31:12]


All other TagLO bits are undefined.


Op = 10010 Index Store Tag (D) 


Index Store Tag (D) stores the COP0 TagLO register into the data cache tag array.
VA[13:6] defines the index and VA[0] defines the way of the tag to be written. The
following mapping defines the sub-operation:

• Lock bit = TagLO[3]

• LRF bit = TagLO[4]

• Valid bit = TagLO[5]

• Dirty bit = TagLO[6] & TagLO[5]

• Tag[19:0] = TagLO[31:12]
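
Note that the stored Dirty bit is qualified by the Valid bit, so a line can never be stored as dirty but invalid. A minimal C model of the mapping, using hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

struct dcache_tag {
    bool     lock, lrf, valid, dirty;
    uint32_t tag;   /* Tag[19:0] */
};

/* Fields written to the data cache tag array by Index Store Tag (D). */
static struct dcache_tag dcache_tag_from_taglo(uint32_t taglo)
{
    struct dcache_tag t;
    t.lock  = ((taglo >> 3) & 1u) != 0;
    t.lrf   = ((taglo >> 4) & 1u) != 0;
    t.valid = ((taglo >> 5) & 1u) != 0;
    t.dirty = ((taglo >> 6) & 1u) != 0 && t.valid;  /* TagLO[6] & TagLO[5] */
    t.tag   = taglo >> 12;
    return t;
}
```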
Op = 10110 Index Invalidate (D)


Index Invalidate (D) sets a line in the data cache to Invalid. VA[13:6] defines the index of
the line and VA[0] defines the way to be invalidated. The Lock bit, Dirty bit, and Valid bit
are changed to zero. The LRF bit does not change.


Op = 11010 Hit Invalidate (D) 


Hit Invalidate (D) invalidates an entry in the data cache which matches the PA computed 
from the CACHE instruction. Both way tags at VA[13:6] are read from the data cache. 


If the Valid bit of the entry is one and the PA of the CACHE instruction matches the Tag 
from the data cache tag array, the Valid bit of the entry is changed to zero (Invalid). The 
Lock bit and Dirty bit are also changed to zero. The LRF bit does not change. 


Op = 11000 Hit Writeback Invalidate (D) 


Hit Writeback Invalidate (D) sub-operation invalidates an entry in the data cache which 
matches the PA computed from the CACHE instruction. Additionally it writes back any 
dirty data to the CPU bus. Both way tags at VA[13:6] are read from the data cache. The 
Lock bit, Dirty bit, and Valid bit are changed to zero. The LRF bits are not modified. 


If the PA computed from the CACHE instruction matches the tag from the data cache tag 
array and the Valid bit is 1, then the Valid bit is changed to 0. Furthermore, if the Dirty
bit is 1 then the cache line is written to the physical address calculated by appending 
VA[11:6] to the 20-bit PFN field from the data cache tag to form PA[31:6]. This address 
represents a cache line physical address. 


Op = 10001 Index Load Data (D) 


Index Load Data (D) reads a single word from the data cache data array and stores it into
the COP0 TagLO register. VA[13:2] defines the index and VA[0] defines the way of the
data cache to be read. The following mapping defines the sub-operation:

• TagLO[31:0] = 32-bit data


Op = 10011 Index Store Data (D) 


Index Store Data (D) stores the COP0 TagLO register into the data cache data array.
VA[13:2] defines the index and VA[0] defines the way of the data cache to be written. The
following mapping defines the sub-operation:


• 32-bit data = TagLO[31:0]


Op = 11100 Hit Writeback Without Invalidate (D)


The Hit Writeback Without Invalidate (D) sub-operation writes back any dirty data to the
CPU bus. The tags of both ways at VA[13:6] are read from the data cache. The Dirty bit is
changed to zero. The LRF bits are not modified.


If the PA computed from the CACHE instruction matches the tag from the data cache tag 
array and the Valid and Dirty bits are 1 then the cache line is written to the physical 
address calculated by appending VA[11:6] to the 20-bit PFN field from the data cache tag 
to form PA[31:6]. This address represents a cache line physical address. 
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Programming Notes: 


For all CACHE sub-operations which operate on the instruction cache the following 
programming restrictions have to be followed: 


1. A sequence of CACHE instructions has to be directly preceded and followed by a 
SYNC.P instruction. 


2. Each individual FILL sub-operation has to be followed by a SYNC.L instruction. 


For all CACHE sub-operations which operate on the data cache the following 
programming restrictions have to be followed: 


1. A sequence of CACHE instructions has to be directly preceded and followed by a
SYNC.L instruction.


2. Each of the three WRITEBACK sub-operations has to be individually followed by a
SYNC.L instruction.


For all CACHE sub-operations which operate on the BTAC the following programming 
restrictions have to be followed: 


1. A sequence of CACHE instructions has to be directly preceded and followed by a
SYNC.P instruction.


C.1.3 Updates of Data Tag Status Bits


The following table summarizes the updates of Data Tag status bits for various Cache sub- 
operations. The values in the table for Hit Writeback Invalidate, Hit Writeback Without 
Invalidate, and Hit Invalidate only apply if there is a hit in the data cache. If there is no 
hit, the status bits are unchanged. 


Table C-2. Data Tag Status Bit Modifications 
DI Disable Interrupt DI


31    26 25    21 20                     6 5      0
COP0     CO       0                        DI
010000   10000    000 0000 0000 0000       111001
6        5        15                       6
C790
Format: DI
Description: 


The DI instruction clears the EIE bit in the Status register and disables all interrupts (except
NMI and SIO). When the EIE bit is cleared, all interrupts are disabled regardless of the
value of the IE bit in the Status register.


When the EDI bit in the Status register is set, the DI instruction operates in User,
Supervisor, and Kernel modes regardless of whether the COP0 coprocessor usable bit
(Status.CU[0]) is set. When the EDI bit is cleared, EI and DI work as NOPs in User and
Supervisor modes, again regardless of Status.CU[0], and execute properly in Kernel mode.


Operation:
if (Status.EDI = 1) || (Status.EXL = 1) || (Status.ERL = 1) || (Status.KSU = 00₂) then
    Status.EIE ← 0
endif
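
The guard in the Operation section can be modelled in C as shown below; the same guard applies to the EI instruction, which sets EIE instead of clearing it. This is an illustrative sketch only: the function names are hypothetical, and the bit positions assumed for EIE (16), EDI (17), and KSU ([4:3]) are not stated in this appendix and should be checked against the Status register definition.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_EXL      (1u << 1)
#define STATUS_ERL      (1u << 2)
#define STATUS_KSU_MASK (3u << 3)   /* assumed position; 00 = Kernel */
#define STATUS_EIE      (1u << 16)  /* assumed position */
#define STATUS_EDI      (1u << 17)  /* assumed position */

/* True when EI/DI actually modify Status.EIE rather than acting as NOPs. */
static bool ei_di_takes_effect(uint32_t status)
{
    return (status & (STATUS_EDI | STATUS_EXL | STATUS_ERL)) != 0 ||
           (status & STATUS_KSU_MASK) == 0;
}

/* Status value after executing DI. */
static uint32_t exec_di(uint32_t status)
{
    return ei_di_takes_effect(status) ? (status & ~STATUS_EIE) : status;
}

/* Status value after executing EI. */
static uint32_t exec_ei(uint32_t status)
{
    return ei_di_takes_effect(status) ? (status | STATUS_EIE) : status;
}
```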
Exceptions: 
None 
EI Enable Interrupt EI


31    26 25    21 20                     6 5      0
COP0     CO       0                        EI
010000   10000    000 0000 0000 0000       111000
6        5        15                       6
C790
Format: EI
Description: 


The EI instruction sets the EIE bit in the Status register. When the EIE bit is set, all
interrupts are enabled if the IE bit in the Status register is 1, the EXL bit is 0, and the
ERL bit is 0.


When the EDI bit in the Status register is set, the EI instruction operates in User,
Supervisor, and Kernel modes regardless of whether the COP0 coprocessor usable bit
(Status.CU[0]) is set. When the EDI bit is cleared, EI and DI work as NOPs in User and
Supervisor modes, again regardless of Status.CU[0], and execute properly in Kernel mode.


Operation:
if (Status.EDI = 1) || (Status.EXL = 1) || (Status.ERL = 1) || (Status.KSU = 00₂) then
    Status.EIE ← 1
endif
Exceptions: 
None 
ERET Exception Return ERET


31    26 25    21 20                     6 5      0
COP0     CO       0                        ERET
010000   10000    000 0000 0000 0000       011000
6        5        15                       6
R4000
Format: ERET
Description: 


ERET is the instruction for returning from an interrupt, exception, or error trap. Unlike a
branch or jump instruction, ERET does not execute the next instruction.


ERET must not itself be placed in a branch delay slot.


If the processor is servicing a Level 2 exception (ERL = 1), ERET loads the PC from
ErrorEPC and clears the ERL bit of the Status register (bit 2). Otherwise (ERL = 0), it
loads the PC from EPC and clears the EXL bit of the Status register (bit 1).


Operation:

if Status.ERL = 1 then
    PC ← ErrorEPC
    Status.ERL ← 0
else
    PC ← EPC
    Status.EXL ← 0
endif
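
The Operation section translates directly into a small C model. The `struct cpu` type and its field names are hypothetical and exist only for this illustration; the pipeline flush performed by real hardware is not modelled.

```c
#include <stdint.h>

#define STATUS_EXL (1u << 1)
#define STATUS_ERL (1u << 2)

struct cpu {
    uint32_t pc;
    uint32_t status;
    uint32_t epc;
    uint32_t error_epc;
};

/* Architectural effect of ERET on PC and Status. */
static void exec_eret(struct cpu *c)
{
    if (c->status & STATUS_ERL) {
        c->pc = c->error_epc;
        c->status &= ~STATUS_ERL;
    } else {
        c->pc = c->epc;
        c->status &= ~STATUS_EXL;
    }
}
```

When both ERL and EXL are set, the first ERET clears only ERL; a second ERET is needed to clear EXL.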


Exceptions:
Coprocessor Unusable exception


Implementation Note:


ERET flushes the execution pipelines of the CPU before fetching the instruction from the
target. Any pending loads, stores, ongoing multiplies, divides, multiply-accumulates and
COP1 instructions are not flushed.


Programming Notes:


A Reserved Instruction must not be placed in the branch delay slot immediately following
an ERET instruction. Take care if any instruction is placed in this slot, because it may be
executed incompletely before the pipeline is flushed. It is recommended that a NOP be
placed in the branch delay slot.
MFBPC Move from Breakpoint Control Register MFBPC

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MF0      rt       Debug    0             MFBPC
010000   00000             11000    0000 0000     000
6        5        5        5        8             3
C790
Format: MFBPC rt

Description:

The contents of the Breakpoint Control register of COP0 are loaded into general
register rt.

Operation:

data ← CPR[0, Breakpoint Control]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MFC0 Move from System Control Coprocessor MFC0


31    26 25    21 20    16 15    11 10                0
COP0     MF0      rt       rd       0
010000   00000                      000 0000 0000
6        5        5        5        11
R4000
Format: MFC0 rt, rd
Description: 


The contents of coprocessor register rd of COP0 are loaded into general register rt.


Operation:


data ← CPR[0, rd]
GPR[rt] ← (data₃₁)³² || data₃₁..₀
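
The second line of the operation is a sign extension of the 32-bit coprocessor value to 64 bits. A one-line C equivalent, with a hypothetical function name:

```c
#include <stdint.h>

/* (data31)^32 || data31..0: replicate bit 31 into the upper 32 bits. */
static uint64_t mfc0_result(uint32_t data)
{
    return (uint64_t)(int64_t)(int32_t)data;
}
```

The same extension applies to the other move-from instructions in this appendix that use this operation.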


Exceptions: 


Coprocessor Unusable exception 
MFDAB Move from Data Address Breakpoint Register MFDAB

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MF0      rt       Debug    0             MFDAB
010000   00000             11000    0000 0000     100
6        5        5        5        8             3
C790
Format: MFDAB rt

Description:

The contents of the Data Address Breakpoint register of COP0 are loaded into general
register rt.

Operation:

data ← CPR[0, Data Address Breakpoint]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MFDABM Move from Data Address Breakpoint Mask Register MFDABM

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MF0      rt       Debug    0             MFDABM
010000   00000             11000    0000 0000     101
6        5        5        5        8             3
C790
Format: MFDABM rt

Description:

The contents of the Data Address Breakpoint Mask register of COP0 are loaded into
general register rt.

Operation:

data ← CPR[0, Data Address Breakpoint Mask]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MFDVB Move from Data Value Breakpoint Register MFDVB

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MF0      rt       Debug    0             MFDVB
010000   00000             11000    0000 0000     110
6        5        5        5        8             3
C790
Format: MFDVB rt

Description:

The contents of the Data Value Breakpoint register of COP0 are loaded into general
register rt.

Operation:

data ← CPR[0, Data Value Breakpoint]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MFDVBM Move from Data Value Breakpoint Mask Register MFDVBM

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MF0      rt       Debug    0             MFDVBM
010000   00000             11000    0000 0000     111
6        5        5        5        8             3
C790
Format: MFDVBM rt

Description:

The contents of the Data Value Breakpoint Mask register of COP0 are loaded into general
register rt.

Operation:

data ← CPR[0, Data Value Breakpoint Mask]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MFIAB Move from Instruction Address Breakpoint Register MFIAB

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MF0      rt       Debug    0             MFIAB
010000   00000             11000    0000 0000     010
6        5        5        5        8             3
C790
Format: MFIAB rt

Description:

The contents of the Instruction Address Breakpoint register of COP0 are loaded into
general register rt.

Operation:

data ← CPR[0, Instruction Address Breakpoint]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MFIABM Move from Instruction Address Breakpoint Mask Register MFIABM

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MF0      rt       Debug    0             MFIABM
010000   00000             11000    0000 0000     011
6        5        5        5        8             3
C790
Format: MFIABM rt

Description:

The contents of the Instruction Address Breakpoint Mask register of COP0 are loaded into
general register rt.

Operation:

data ← CPR[0, Instruction Address Breakpoint Mask]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MFPC Move from Performance Counter MFPC

31    26 25    21 20    16 15    11 10     6 5     1 0
COP0     MF0      rt       Perf     0        reg     1
010000   00000             11001    00000            1
6        5        5        5        5        5       1
C790
Format: MFPC rt, reg

Description:

The contents of a Performance Counter register of COP0 are loaded into general register
rt.

The reg field of the instruction selects which Performance Counter is read. Only registers
0 and 1 are valid in the C790 implementation.

Operation:

data ← CPR[0, Performance Counter (reg)]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MFPS Move from Performance Event Specifier MFPS

31    26 25    21 20    16 15    11 10     6 5     1 0
COP0     MF0      rt       Perf     0        reg     0
010000   00000             11001    00000            0
6        5        5        5        5        5       1
C790
Format: MFPS rt, reg

Description:

The contents of the Performance Control register of COP0 are loaded into general register
rt.

The reg field of the instruction selects which Performance Control register is read. Only
register 0 is valid in the C790 implementation.

Operation:

data ← CPR[0, Performance Control (reg)]
GPR[rt] ← (data₃₁)³² || data₃₁..₀

Exceptions:

Coprocessor Unusable exception
MTBPC Move to Breakpoint Control Register MTBPC

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MT0      rt       Debug    0             MTBPC
010000   00100             11000    0000 0000     000
6        5        5        5        8             3
C790
Format: MTBPC rt

Description:

The contents of general register rt are loaded into the Breakpoint Control register of COP0.

Operation:

data ← GPR[rt]
CPR[0, Breakpoint Control] ← data

Programming Notes:

All MTBPC instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.

Exceptions:

Coprocessor Unusable exception
MTC0 Move to System Control Coprocessor MTC0

31    26 25    21 20    16 15    11 10                0
COP0     MT0      rt       rd       0
010000   00100                      000 0000 0000
6        5        5        5        11
R4000
Format: MTC0 rt, rd

Description:

The contents of general register rt are loaded into coprocessor register rd of COP0.

Operation:

data ← GPR[rt]
CPR[0, rd] ← data

Programming Notes:

1. All MTC0 instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update. There is one exception to this rule:

a) An MTC0 instruction which loads the EntryHi COP0 register can be followed by
a TLBWI or a TLBWR instruction without an intervening SYNC.P
instruction. This special case is handled by a hardware interlock.

2. An MTC0 instruction to the EntryHi register MUST be executed either
from unmapped space or from global mapped space (mapped space with a TLB entry
which has the G bit set). Furthermore, the BTAC is flushed whenever the EntryHi
register is updated.

3. CONFIG.K0 should not be modified by an MTC0 instruction executed from kseg0 space.

4. A SYNC.L instruction is needed before executing an MTC0 instruction which modifies
CONFIG.NBE or CONFIG.DCE.

5. Updating the performance counter registers via an MTC0 instruction while the
performance counters are enabled will result in undefined counter values.

Exceptions:

Coprocessor Unusable exception
MTDAB Move to Data Address Breakpoint Register MTDAB

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MT0      rt       Debug    0             MTDAB
010000   00100             11000    0000 0000     100
6        5        5        5        8             3
C790
Format: MTDAB rt

Description:

The contents of general register rt are loaded into the Data Address Breakpoint register of
COP0.

Operation:

data ← GPR[rt]
CPR[0, Data Address Breakpoint] ← data

Programming Notes:

All MTDAB instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.

Exceptions:

Coprocessor Unusable exception
MTDABM Move to Data Address Breakpoint Mask Register MTDABM

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MT0      rt       Debug    0             MTDABM
010000   00100             11000    0000 0000     101
6        5        5        5        8             3
C790
Format: MTDABM rt

Description:

The contents of general register rt are loaded into the Data Address Breakpoint Mask
register of COP0.

Operation:

data ← GPR[rt]
CPR[0, Data Address Breakpoint Mask] ← data

Programming Notes:

All MTDABM instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.

Exceptions:

Coprocessor Unusable exception
MTDVB Move to Data Value Breakpoint Register MTDVB

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MT0      rt       Debug    0             MTDVB
010000   00100             11000    0000 0000     110
6        5        5        5        8             3
C790
Format: MTDVB rt

Description:

The contents of general register rt are loaded into the Data Value Breakpoint register of
COP0.

Operation:

data ← GPR[rt]
CPR[0, Data Value Breakpoint] ← data

Programming Notes:

All MTDVB instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.

Exceptions:

Coprocessor Unusable exception
MTDVBM Move to Data Value Breakpoint Mask Register MTDVBM

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MT0      rt       Debug    0             MTDVBM
010000   00100             11000    0000 0000     111
6        5        5        5        8             3
C790
Format: MTDVBM rt

Description:

The contents of general register rt are loaded into the Data Value Breakpoint Mask
register of COP0.

Operation:

data ← GPR[rt]
CPR[0, Data Value Breakpoint Mask] ← data

Programming Notes:

All MTDVBM instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.

Exceptions:

Coprocessor Unusable exception
MTIAB Move to Instruction Address Breakpoint Register MTIAB

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MT0      rt       Debug    0             MTIAB
010000   00100             11000    0000 0000     010
6        5        5        5        8             3
C790
Format: MTIAB rt

Description:

The contents of general register rt are loaded into the Instruction Address Breakpoint
register of COP0.

Operation:

data ← GPR[rt]
CPR[0, Instruction Address Breakpoint] ← data

Programming Notes:

All MTIAB instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.

Exceptions:

Coprocessor Unusable exception
MTIABM Move to Instruction Address Breakpoint Mask Register MTIABM

31    26 25    21 20    16 15    11 10          3 2     0
COP0     MT0      rt       Debug    0             MTIABM
010000   00100             11000    0000 0000     011
6        5        5        5        8             3
C790
Format: MTIABM rt

Description:

The contents of general register rt are loaded into the Instruction Address Breakpoint Mask
register of COP0.

Operation:

data ← GPR[rt]
CPR[0, Instruction Address Breakpoint Mask] ← data

Programming Notes:

All MTIABM instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.

Exceptions:

Coprocessor Unusable exception




MTPC               Move to Performance Counter               MTPC


 31     26 25     21 20     16 15     11 10      6 5      1   0
    COP0       MT0        rt        Perf        0       reg    1
   010000     00100                11001      00000
      6         5         5          5          5        5     1

Format: MTPC rt, reg 


C790 


Description: 
The contents of general register rt are loaded into the Performance Counter register.


The reg field of the instruction selects the Performance Counter register. Only registers 0 and 1
are valid in the C790 implementation.


Operation:


data ← GPR[rt]
CPR[0, Performance Counter (reg)] ← data


Programming Notes:


All MTPC instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.


Updating the performance counters via a MTPC instruction while the performance 
counters are enabled will result in undefined counter values. 


Exceptions: 


Coprocessor unusable exception 
MTPS            Move to Performance Event Specifier            MTPS


 31     26 25     21 20     16 15     11 10      6 5      1   0
    COP0       MT0        rt        Perf        0       reg    0
   010000     00100                11001      00000
      6         5         5          5          5        5     1


Format: MTPS rt, reg 


C790 


Description: 
The contents of general register rt are loaded into the Performance Control register.


The reg field of the instruction selects the Performance Control register. Only register
0 is valid in the C790 implementation.


Operation:


data ← GPR[rt]
CPR[0, Performance Control (reg)] ← data


Programming Notes:


All MTPS instructions MUST be followed by a SYNC.P instruction as a barrier to
guarantee the COP0 register update.


Exceptions: 


Coprocessor unusable exception 
TLBP               Probe TLB for Matching Entry               TLBP


 31     26 25     21 20                          6 5           0
    COP0       C0                 0                   TLBP
   010000     10000      000 0000 0000 0000          001000
      6         5                15                     6


R4000
Format: TLBP


Description: 


The Index register is loaded with the address of the TLB entry whose contents match the
contents of the EntryHi register. If no TLB entry matches, the high-order bit of the Index
register is set to 1. Note that the virtual address in the EntryHi register is masked with
the corresponding mask field of the TLB entry prior to the comparison.


The architecture does not specify the operation of memory references associated with the 
instruction immediately after a TLBP instruction, nor is the operation specified if more 
than one TLB entry matches. 


Operation: 


Index ← 1 || 0^25 || undefined^6
for i in 0..TLBEntries-1
    if (TLB[i]_{95..77} = ((not TLB[i]_{127..109}) and EntryHi_{31..13})) and
       (TLB[i]_{76} or (TLB[i]_{71..64} = EntryHi_{7..0})) then
        Index ← 0^26 || i_{5..0}
    endif
endfor


Programming Notes:


The TLBP instruction MUST be immediately followed by a SYNC.P or ERET instruction.


Exceptions:


Coprocessor Unusable exception 
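

The probe operation can be modelled in a few lines. The following is a minimal Python sketch under stated assumptions, not the hardware implementation: the TlbEntry class and the tlbp function are hypothetical names, each entry carries only the fields the comparison uses, and the six low bits that the pseudocode leaves undefined on a miss are returned as zero.

```python
from dataclasses import dataclass

NO_MATCH = 1 << 31  # high-order bit of Index, set when no entry matches


@dataclass
class TlbEntry:
    mask: int   # 19-bit compare mask, TLB bits 127..109 (1 = ignore this VPN2 bit)
    vpn2: int   # 19-bit VPN2, TLB bits 95..77 (already masked when written)
    g: bool     # global bit, TLB bit 76
    asid: int   # 8-bit ASID, TLB bits 71..64


def tlbp(tlb: list[TlbEntry], entry_hi: int) -> int:
    """Return the new Index register value for a probe with the given EntryHi."""
    vpn2 = (entry_hi >> 13) & 0x7FFFF   # EntryHi bits 31..13
    asid = entry_hi & 0xFF              # EntryHi bits 7..0
    index = NO_MATCH                    # assumption: undefined low bits modelled as 0
    for i, entry in enumerate(tlb):
        vpn_match = entry.vpn2 == (vpn2 & ~entry.mask & 0x7FFFF)
        if vpn_match and (entry.g or entry.asid == asid):
            index = i & 0x3F            # 0^26 || i_{5..0}
    return index


tlb = [TlbEntry(0, 0x100, False, 5), TlbEntry(0b11, 0x200, True, 0)]
print(tlbp(tlb, (0x100 << 13) | 5))
```

Like the pseudocode, the loop does not stop at the first hit; the architecture leaves the result unspecified when more than one entry matches, so software should not rely on which index is reported in that case.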
TLBR                 Read Indexed TLB Entry                 TLBR


 31     26 25     21 20                          6 5           0
    COP0       C0                 0                   TLBR
   010000     10000      000 0000 0000 0000          000001
      6         5                15                     6


R4000
Format: TLBR


Description: 


The EntryHi, EntryLo, and PageMask registers are loaded with the contents of the TLB
entry pointed at by the contents of the TLB Index register.


The G bit (which controls ASID matching) read from the TLB is written into both the
EntryLo0 and EntryLo1 registers. Depending on the value in the PageMask register used for a
TLB write instruction, the value read out from the TLB may differ from what was originally
written. See the Description of the TLBWI/TLBWR instructions.

Operation: 


PageMask ← TLB[Index_{5..0}]_{127..96}
EntryHi ← (TLB[Index_{5..0}]_{95..77} || 0^5 || TLB[Index_{5..0}]_{71..64}) and (not TLB[Index_{5..0}]_{127..96})
EntryLo0 ← TLB[Index_{5..0}]_{63..33} || TLB[Index_{5..0}]_{76}
EntryLo1 ← TLB[Index_{5..0}]_{31..1} || TLB[Index_{5..0}]_{76}


Programming Notes:


The TLBR instruction MUST be executed from either unmapped space or global mapped
space (mapped space with a TLB entry which has the G bit set).


The TLBR instruction MUST be immediately followed by a SYNC.P or ERET instruction.


Exceptions:


Coprocessor Unusable exception 
TLBWI                 Write Index TLB Entry                 TLBWI


 31     26 25     21 20                          6 5           0
    COP0       C0                 0                   TLBWI
   010000     10000      000 0000 0000 0000          000010
      6         5                15                     6


R4000
Format: TLBWI


Description: 


The TLB entry pointed at by the contents of the TLB Index register is loaded with the
contents of the PageMask, EntryHi, EntryLo0 and EntryLo1 registers.


The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and
EntryLo1 registers. The virtual address in the EntryHi register is modified by the Mask
field of the PageMask register before being written into the TLB.


The operation is invalid (and the results are unspecified) if the contents of the TLB Index
register are greater than the number of TLB entries in the processor.


In the C790 processor, a TLB write instruction writes the whole page frame
number from the EntryLo registers to the TLB entry. Depending on the page size specified
in the corresponding PageMask register, the lower bits of the PFN may not be used for
address translation, and the lower bits of VPN2 in the EntryHi register that are masked by the
contents of the PageMask register are forced to zero during a TLB write. This does not affect
TLB address translation; however, a TLB read may not retrieve what was originally
written.


Operation: 


TLB[Index_{5..0}] ←
    PageMask || ((EntryHi_{31..13} || (EntryLo0_0 and EntryLo1_0) || EntryHi_{11..0}) and
    (not PageMask)) || EntryLo0_{31..1} || 0 || EntryLo1_{31..1} || 0


Programming Notes:


The TLBWI instruction MUST be executed from either unmapped space or global mapped
space (mapped space with a TLB entry which has the G bit set).


The TLBWI instruction MUST be followed by an ERET or a SYNC.P instruction to ensure
TLB update.


Exceptions: 


Coprocessor Unusable exception 
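

The remark that a TLB read may not return what was written follows directly from the write and read pseudocode. The following is a minimal Python sketch, with hypothetical function names tlb_write and tlb_read, that packs the 128-bit entry as the TLBWI Operation describes and unpacks it as the TLBR Operation describes. It models only the bit manipulation, not the TLB array or the Index register.

```python
MASK32 = 0xFFFF_FFFF


def tlb_write(page_mask: int, entry_hi: int, entry_lo0: int, entry_lo1: int) -> int:
    """Pack a 128-bit TLB entry following the TLBWI/TLBWR Operation."""
    g = entry_lo0 & entry_lo1 & 1                     # AND of the two G bits
    # EntryHi 31..13 || G || EntryHi 11..0, then cleared where PageMask is set.
    hi_word = (entry_hi & 0xFFFF_E000) | (g << 12) | (entry_hi & 0xFFF)
    hi_word &= ~page_mask & MASK32
    lo0_word = entry_lo0 & 0xFFFF_FFFE                # EntryLo0 31..1 || 0
    lo1_word = entry_lo1 & 0xFFFF_FFFE                # EntryLo1 31..1 || 0
    return (page_mask << 96) | (hi_word << 64) | (lo0_word << 32) | lo1_word


def tlb_read(entry: int) -> tuple[int, int, int, int]:
    """Unpack (PageMask, EntryHi, EntryLo0, EntryLo1) following the TLBR Operation."""
    page_mask = (entry >> 96) & MASK32
    hi_word = (entry >> 64) & MASK32
    g = (hi_word >> 12) & 1                           # TLB bit 76
    # TLB 95..77 || 0^5 || TLB 71..64, masked with not PageMask.
    entry_hi = ((hi_word & 0xFFFF_E000) | (hi_word & 0xFF)) & ~page_mask & MASK32
    entry_lo0 = ((entry >> 32) & 0xFFFF_FFFE) | g
    entry_lo1 = (entry & 0xFFFF_FFFE) | g
    return page_mask, entry_hi, entry_lo0, entry_lo1


# VPN2 bits covered by PageMask (0x6000) are lost on the way through the TLB.
print([hex(v) for v in tlb_read(tlb_write(0x6000, 0x1234_605A, 0xC1F, 0xC5F))])
```

In the example, EntryHi is written as 0x1234605A but reads back as 0x1234005A because bits 14..13 are masked. In the same way, a G bit set in only one EntryLo register reads back as 0 in both.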
TLBWR                 Write Random TLB Entry                 TLBWR


 31     26 25     21 20                          6 5           0
    COP0       C0                 0                   TLBWR
   010000     10000      000 0000 0000 0000          000110
      6         5                15                     6


R4000
Format: TLBWR


Description: 


The TLB entry pointed at by the contents of the TLB Random register is loaded with the
contents of the PageMask, EntryHi, EntryLo0 and EntryLo1 registers.


The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and
EntryLo1 registers. The virtual address in the EntryHi register is modified by the Mask
field of the PageMask register before being written into the TLB.


In the C790 processor, a TLB write instruction writes the whole page frame
number from the EntryLo registers to the TLB entry. Depending on the page size specified
in the corresponding PageMask register, the lower bits of the PFN may not be used for
address translation, and the lower bits of VPN2 in the EntryHi register that are masked by the
contents of the PageMask register are forced to zero during a TLB write. This does not affect
TLB address translation; however, a TLB read may not retrieve what was originally
written.


Operation: 


TLB[Random_{5..0}] ←
    PageMask || ((EntryHi_{31..13} || (EntryLo0_0 and EntryLo1_0) || EntryHi_{11..0}) and
    (not PageMask)) || EntryLo0_{31..1} || 0 || EntryLo1_{31..1} || 0


Programming Notes:


The TLBWR instruction MUST be executed from either unmapped space or global mapped
space (mapped space with a TLB entry which has the G bit set).


The TLBWR instruction MUST be followed by an ERET or a SYNC.P instruction to ensure
TLB update.


Exceptions: 


Coprocessor Unusable exception 
C.2 COP0 Instruction Encoding


 31     26 25                                                  0
   OpCode

Instructions encoded by OpCode field (COP0, CACHE). Rows are selected by bits 31..29 and
columns by bits 28..26, each running from 0 (000) to 7 (111).
[Table body not recoverable from the source.]


 31     26 25     21 20                                        0
   OpCode      rs
   = COP0

Instructions encoded by rs field when OpCode field = COP0. Rows are selected by bits 25..24
and columns by bits 23..21.
[Table body not recoverable from the source.]


 31     26 25     21 20     16 15     11 10              3 2   0
   OpCode      rs               rd = Debug             function
   = COP0

Instructions encoded by function field when OpCode field = COP0 & rd field = Debug. Rows are
selected by the rs field (MF0, MT0) and columns by bits 2..0.
[Table body not recoverable from the source.]


 31     26 25     21 20     16 15     11 10                1   0
   OpCode      rs               rd = Perf
   = COP0

Instructions encoded by function field when OpCode field = COP0 & rd field = Perf. Rows are
selected by the rs field (MF0, MT0) and columns by bit 0 (0, 1).
[Table body not recoverable from the source.]


* Debug and Perf are the CP0 register names.
  Debug = 11000 (24), Perf = 11001 (25)
 31     26 25     21 20     16 15                              0
   OpCode    rs = BC0     rt
   = COP0

Instructions encoded by rt field when OpCode field = COP0 & rs field = BC0. Rows are selected
by bits 20..19 and columns by bits 18..16.
[Table body not recoverable from the source.]


Instructions encoded by function field when OpCode field = COP0 & rs field = C0. Rows are
selected by bits 5..3 and columns by bits 2..0.
[Table body not recoverable from the source.]


This OpCode is reserved for future use. An attempt to execute it causes a 
Reserved Instruction exception. 


This OpCode is reserved for future use. An attempt to execute it produces an 
undefined result. The result may be a Reserved Instruction exception but 
this is not guaranteed. 


This OpCode indicates an instruction class. The instruction word must be 
further decoded by examining additional tables that show the values for 
another instruction field. 


This OpCode is reserved for one of the following instructions which are 
currently not supported: DMULT, DMULTU, DDIV, DDIVU, LL, LLD, SC, 
SCD, LWC2, SWC2. An attempt to execute it causes a Reserved Instruction 
exception. 
D. COP1 (FPU) Instruction Set Details


This appendix provides a detailed description of each of the COP1 coprocessor instructions. 
COP1 is implemented as a floating point unit (FPU). 


The instruction descriptions provide: 


• a bit-by-bit field definition of the instruction word signifying that instruction

• a verbal description of the operation performed by the instruction

• pseudo-code identifying the entire sphere of influence of the instruction in terms
  of operand dependency and the state(s) of the processor changed.


Any processor state not mentioned in a description is unchanged by execution of
that instruction.
D.1 Conventions Used in This Chapter


D.1.1 Instruction Description Notation and Functions


The Operation sections of the instruction descriptions use a high-level language notation, 
or pseudocode, to describe the instruction’s operations. Symbols, functions, and structures 
used in the Operation sections are described here. 


The notation FPR as used here refers to the 32 floating-point registers FPR0 through
FPR31 of the FPU.


D.1.2 Pseudocode Language Statement Execution 


Each of the high-level language statements in an operation description is executed in 
sequential order (as modified by conditional and loop constructs). 


D.1.3 Pseudocode Symbols


Special symbols used in the notation are described in Appendix A. 


D.2 Definitions for Pseudocode Functions Used in Operation 
Descriptions 


A variety of functions are used in the pseudocode descriptions to make the pseudocode 
more readable and also to abstract implementation-specific behavior. These functions are 
defined in Appendix A; in addition, certain COP1 FPU-specific functions are described in 
the following section. The following pseudocode notation is used in functions in the 
descriptions of floating-point operations: 


Pseudocode Function               Meaning
StoreFPR (fpr, value)             FPR[fpr] ← value

ConvertFmt (value, fmt1, fmt2)    The value in the format fmt1 is converted to a
                                  value in the format fmt2.

Negate (value)                    The value is negated by changing the sign bit
                                  value.

Sign-extend (value)               A sign-extended 32-bit value has bits 63..31 of
                                  equal value.
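

Two of the functions above are simple bit operations. The following is a minimal Python sketch, with hypothetical helper names negate_single and sign_extend_32, assuming a single-precision operand held as a raw 32-bit pattern and a 64-bit destination register.

```python
def negate_single(bits: int) -> int:
    """Negate an IEEE 754 single, given as a 32-bit pattern, by flipping the sign bit."""
    return (bits ^ 0x8000_0000) & 0xFFFF_FFFF


def sign_extend_32(value: int) -> int:
    """Sign-extend a 32-bit value to 64 bits, so that bits 63..31 are all equal."""
    value &= 0xFFFF_FFFF
    if value & 0x8000_0000:
        return value | 0xFFFF_FFFF_0000_0000
    return value


print(hex(negate_single(0x3F80_0000)))    # pattern of 1.0 with the sign bit flipped
print(hex(sign_extend_32(0x8000_0000)))
```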
D.3 Instruction Descriptions


Descriptions of FPU Instructions follow. 
ABS.fmt            Floating Point Absolute Value            ABS.fmt


 31     26 25     21 20     16 15     11 10      6 5       0
    COP1       fmt        0         fs        fd       ABS
   010001               00000                        000101
      6         5         5         5         5         6

MIPS I
Format:       ABS.S fd, fs
              ABS.D fd, fs
Purpose:      To compute the absolute value of an FP value.


Description:  fd ← absolute (fs)


The absolute value of the value in FPR fs is placed in FPR fd. The operand and result are 
values in format fmt. 


This operation is arithmetic; a NaN operand signals invalid operation.


Restrictions:


The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, fmt, AbsoluteValue (ValueFPR (fs, fmt))) 


Exceptions: 


Coprocessor Unusable 

Reserved Instruction 

Floating-Point 
Unimplemented Operation 
Invalid Operation 
ADD.fmt                 Floating Point Add                 ADD.fmt


 31     26 25     21 20     16 15     11 10      6 5       0
    COP1       fmt        ft        fs        fd       ADD
   010001                                            000000
      6         5         5         5         5         6

MIPS I
Format:       ADD.S fd, fs, ft
              ADD.D fd, fs, ft


Purpose:      To add FP values.
Description:  fd ← fs + ft


The value in FPR ft is added to the value in FPR fs. The result is calculated to infinite 
precision, rounded according to the current rounding mode in FCR31, and placed into FPR 
fd. The operands and result are values in format fmt. 


Restrictions: 


The fields fs, ft and fd must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, fmt, ValueFPR (fs, fmt) + ValueFPR (ft, fmt)) 


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Unimplemented Operation 
Invalid Operation 
Inexact 
Overflow 
Underflow 
BC1F                  Branch on FP False                  BC1F

 31     26 25     21 20     16 15                              0
    COP1       BC1       BC1F                 offset
   010001     01000      00000
      6         5          5                    16


MIPS I
Format:       BC1F offset


Purpose:      To test an FP condition code and do a PC-relative conditional branch.
Description:  if (C = 0) then branch, where C is FCR31_{23}


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the result of the last floating point compare is false, branch to the effective target 
address after the instruction in the delay slot is executed. 


An FP condition code is set by the FP compare instruction, C.cond.fmt. 


Operation: 
I:    condition ← (FCR31_{23} = 0)
      target ← (offset_{15})^{GPRLEN-(16+2)} || offset || 0^2
I+1:  if condition then
          PC ← PC + target
      endif


Exceptions:


Coprocessor Unusable 
Reserved Instruction 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses. 
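

The target calculation described above can be checked with a short calculation. The following is a minimal Python sketch, not taken from the manual: the function name bc1_target is hypothetical, and a 32-bit address space is assumed so that the result wraps at 2^32.

```python
def bc1_target(branch_pc: int, offset16: int) -> int:
    """Effective target of BC1F/BC1T for a branch located at branch_pc."""
    offset = (offset16 & 0xFFFF) << 2      # 16-bit field shifted left 2 bits
    if offset & 0x2_0000:                  # bit 17 is the sign of the 18-bit value
        offset -= 0x4_0000
    # The offset is relative to the instruction in the delay slot (branch_pc + 4).
    return (branch_pc + 4 + offset) & 0xFFFF_FFFF


print(hex(bc1_target(0x8010_0000, 0x7FFF)))   # furthest forward target
print(hex(bc1_target(0x8010_0000, 0x8000)))   # furthest backward target
```

The two calls show the extremes of the range: 131068 bytes forward and 131072 bytes backward from the delay-slot instruction.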
BC1T                  Branch on FP True                  BC1T

 31     26 25     21 20     16 15                              0
    COP1       BC1       BC1T                 offset
   010001     01000      00001
      6         5          5                    16


MIPS I
Format:       BC1T offset


Purpose:      To test an FP condition code and do a PC-relative conditional branch.
Description:  if (C = 1) then branch, where C is FCR31_{23}


An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of 
the instruction following the branch (not the branch itself), in the branch delay slot, to 
form a PC-relative effective target address. 


If the result of the last floating point compare is true, branch to the effective target 
address after the instruction in the delay slot is executed. 


An FP condition code is set by the FP compare instruction, C.cond.fmt. 


Operation: 
I:    condition ← (FCR31_{23} = 1)
      target ← (offset_{15})^{GPRLEN-(16+2)} || offset || 0^2
I+1:  if condition then
          PC ← PC + target
      endif


Exceptions:


Coprocessor Unusable 
Reserved Instruction 


Programming Notes: 


With the 18-bit signed instruction offset, the conditional branch range is ±128 KB. Use
jump (J) or jump register (JR) instructions to branch to more distant addresses. 
C.cond.fmt               Floating Point Compare               C.cond.fmt

 31     26 25     21 20     16 15     11 10      6 5  4 3     0
    COP1       fmt        ft        fs        0      FC    cond
   010001                                   00000    11
      6         5         5         5         5       2      4

MIPS I
Format:       C.cond.S fs, ft
              C.cond.D fs, ft


Purpose:      To compare FP values and record the Boolean result in a condition code.
Description:  C ← fs compare_cond ft


The value in FPR fs is compared to the value in FPR ft; the values are in format fmt. The
comparison is exact and neither overflows nor underflows. If the comparison specified by
cond_{2..1} is true for the operand values, then the result is true, otherwise it is false. If no
exception is taken, the result is written into condition code C; true is 1 and false is 0.


If cond_3 is set and at least one of the values is a NaN, an Invalid Operation condition is
raised; the result depends on the FP exception model currently active.


The Invalid Operation flag is set in the FCR31. If the Invalid Operation enable bit is set in
the FCR31, no result is written and an Invalid Operation exception is taken immediately.
Otherwise, the Boolean result is written into condition code C.


There are four mutually exclusive ordering relations for comparing floating-point values; 
one relation is always true and the others are false. The familiar relations are greater 
than, less than, and equal. In addition, the IEEE floating-point standard defines the 
relation unordered which is true when at least one operand value is NaN; NaN compares 
unordered with everything, including itself. Comparisons ignore the sign of zero, so +0 
equals -0. 


The comparison condition is a logical predicate, or equation, of the ordering relations such 
as “less than or equal”, “equal”, “not less than”, or “unordered or equal”. Compare 
distinguishes sixteen comparison predicates. The Boolean result of the instruction is 
obtained by substituting the Boolean value of each ordering relation for the two FP values
into the equation. If the equal relation is true, for example, then all four example predicates
above would yield a true result. If the unordered relation is true then only the final 
predicate, “unordered or equal” would yield a true result. 


Logical negation of a compare result allows eight distinct comparisons to test for sixteen 
predicates as shown in Table D-1. Each mnemonic tests for both a predicate and its logical 
negation. For each mnemonic, compare tests the truth of the first predicate. When the 
first predicate is true, the result is true as shown in the “if predicate is true” column (note 
that the False predicate is never true and False/True do not follow the normal pattern). 
When the first predicate is true, the second predicate must be false, and vice versa. The 
truth of the second predicate is the logical negation of the instruction result. After a 
compare instruction, test for the truth of the first predicate with the Branch on FP True 
(BC1T) instruction and the truth of the second with Branch on FP False (BC1F). 
Table D-1. FPU Comparisons Without Special Operand Exceptions

Instr                                                       Relation     CC result if   Inv Op    Instr cond
cond      Comparison predicate: name of predicate and       values       predicate      excp if   field
mnemonic  logically negated predicate (abbreviation)        >  <  =  ?   is true        QNaN      3    2..0
--------  ------------------------------------------------  -----------  -------------  --------  ---------
F         False [this predicate is always False, it         F  F  F  F   F              No        0    0
          never has a True result]
          True (T)                                          T  T  T  T   T
UN        Unordered                                         F  F  F  T   T              No        0    1
          Ordered (OR)                                      T  T  T  F   F
EQ        Equal                                             F  F  T  F   T              No        0    2
          Not Equal (NEQ)                                   T  T  F  T   F
UEQ       Unordered or Equal                                F  F  T  T   T              No        0    3
          Ordered or Greater than or Less than (OGL)        T  T  F  F   F
OLT       Ordered or Less Than                              F  T  F  F   T              No        0    4
          Unordered or Greater than or Equal (UGE)          T  F  T  T   F
ULT       Unordered or Less Than                            F  T  F  T   T              No        0    5
          Ordered or Greater than or Equal (OGE)            T  F  T  F   F
OLE       Ordered or Less than or Equal                     F  T  T  F   T              No        0    6
          Unordered or Greater Than (UGT)                   T  F  F  T   F
ULE       Unordered or Less than or Equal                   F  T  T  T   T              No        0    7
          Ordered or Greater Than (OGT)                     T  F  F  F   F

key: “?” = unordered, “>” = greater than, “<” = less than, “=” is equal, “T” = True, “F” = False
There is another set of eight compare operations, distinguished by a cond_3 value of 1,
testing the same sixteen conditions. For these additional comparisons, if at least one of the
operands is a NaN, including Quiet NaN, then an Invalid Operation condition is raised. If
the Invalid Operation condition is enabled in the FCR31, then an Invalid Operation
exception occurs.


Table D-2. FPU Comparisons With Special Operand Exceptions for QNaNs

Instr                                                       Relation     CC result if   Inv Op    Instr cond
cond      Comparison predicate: name of predicate and       values       predicate      excp if   field
mnemonic  logically negated predicate (abbreviation)        >  <  =  ?   is true        QNaN      3    2..0
--------  ------------------------------------------------  -----------  -------------  --------  ---------
SF        Signaling False [this predicate always False]     F  F  F  F   F              Yes       1    0
          Signaling True (ST)                               T  T  T  T   T
NGLE      Not Greater than or Less than or Equal            F  F  F  T   T              Yes       1    1
          Greater than or Less than or Equal (GLE)          T  T  T  F   F
SEQ       Signaling Equal                                   F  F  T  F   T              Yes       1    2
          Signaling Not Equal (SNE)                         T  T  F  T   F
NGL       Not Greater than or Less than                     F  F  T  T   T              Yes       1    3
          Greater than or Less than (GL)                    T  T  F  F   F
LT        Less Than                                         F  T  F  F   T              Yes       1    4
          Not Less Than (NLT)                               T  F  T  T   F
NGE       Not Greater than or Equal                         F  T  F  T   T              Yes       1    5
          Greater than or Equal (GE)                        T  F  T  F   F
LE        Less than or Equal                                F  T  T  F   T              Yes       1    6
          Not Less than or Equal (NLE)                      T  F  F  T   F
NGT       Not Greater Than                                  F  T  T  T   T              Yes       1    7
          Greater Than (GT)                                 T  F  F  F   F

key: “?” = unordered, “>” = greater than, “<” = less than, “=” is equal, “T” = True, “F” = False


Restrictions: 


The fields fs and ft must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 


if NaN (ValueFPR (fs, fmt)) or NaN (ValueFPR (ft, fmt)) then
    less ← false
    equal ← false
    unordered ← true
    if cond_3 then
        SignalException (InvalidOperation)
    endif
else
    less ← ValueFPR (fs, fmt) < ValueFPR (ft, fmt)
    equal ← ValueFPR (fs, fmt) = ValueFPR (ft, fmt)
    unordered ← false
endif
condition ← (cond_2 and less) or (cond_1 and equal) or (cond_0 and unordered)
C ← condition
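

The Operation pseudocode maps onto ordinary floating-point comparisons. The following is a minimal Python sketch, not the hardware behaviour in full: the names fp_compare and InvalidOperation are hypothetical, the 4-bit cond value is passed as an integer, Python floats stand in for the FPR operands, Signaling NaNs are not distinguished, and the Invalid Operation exception is modelled as always enabled.

```python
import math


class InvalidOperation(Exception):
    """Stands in for the FP Invalid Operation exception (assumed enabled)."""


def fp_compare(cond: int, fs: float, ft: float) -> bool:
    """Return condition code C for C.cond.fmt with a 4-bit cond field."""
    if math.isnan(fs) or math.isnan(ft):
        less, equal, unordered = False, False, True
        if cond & 0b1000:                      # cond_3: signal on any NaN
            raise InvalidOperation("NaN operand in signaling compare")
    else:
        less, equal, unordered = fs < ft, fs == ft, False
    # cond_2 selects "less", cond_1 selects "equal", cond_0 selects "unordered".
    return bool(((cond & 0b100) and less)
                or ((cond & 0b010) and equal)
                or ((cond & 0b001) and unordered))


print(fp_compare(0b0110, 1.0, 2.0))            # OLE: ordered and less than or equal
print(fp_compare(0b0001, float("nan"), 1.0))   # UN: unordered
```

The cond values follow Table D-1 and Table D-2; for example, 2 is EQ, 4 is OLT, and setting bit 3 turns EQ into SEQ.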
Exceptions:


Coprocessor Unusable 

Reserved Instruction 

Floating-Point 
Unimplemented Operation 
Invalid Operation 


Programming Notes: 


FP computational instructions, including compare, that receive an operand value of 
Signaling NaN, will raise the Invalid Operation condition. The comparisons that raise the 
Invalid Operation condition for Quiet NaNs in addition to SNaNs, permit a simpler 
programming model if NaNs are errors. Using these compares, programs do not need 
explicit code to check for QNaNs causing the unordered relation. Instead, they take an 
exception and allow the exception handling system to deal with the error when it occurs. 
For example, consider a comparison in which we want to know if two numbers are equal, 
but for which unordered would be an error. 


# comparisons using explicit tests for QNaN
    c.eq.d  $f2,$f4     # check for equal
    nop
    bc1t    L2          # it is equal
    c.un.d  $f2,$f4     # it is not equal, but might be unordered
    bc1t    ERROR       # unordered goes off to an error handler
# not-equal-case code here


# equal-case code here
L2:


# comparison using comparisons that signal QNaN
    c.seq.d $f2,$f4     # check for equal
    nop
    bc1t    L2          # it is equal
    nop
# it is not unordered here...
# not-equal-case code here


# equal-case code here
L2:
CEIL.L.fmt      Floating-Point Ceiling Convert to Long Fixed-Point      CEIL.L.fmt


 31     26 25     21 20     16 15     11 10      6 5       0
    COP1       fmt        0         fs        fd      CEIL.L
   010001               00000                        001010
      6         5         5         5         5         6

MIPS III
Format:       CEIL.L.S fd, fs
              CEIL.L.D fd, fs
Purpose:      To convert an FP value to 64-bit fixed-point, rounding up.


Description:  fd ← convert_and_round (fs)


The value in FPR fs, in format fmt, is converted to a value in 64-bit long fixed-point format,
rounding toward +∞ (rounding mode 2). The result is placed in FPR fd.


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to
2^63 - 1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in the FCR31. If the Invalid Operation enable bit is set in
the FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^63 - 1, is written to fd.


Restrictions: 


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed-point; see 
Floating-Point Registers on page 10-2. If they are not valid, the result is undefined. 


Operation: 
StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L)) 
Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
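

The range check and default result described for CEIL.L can be modelled as follows. This is a minimal Python sketch with a hypothetical function name ceil_l; it assumes the Invalid Operation enable bit is clear, so the default result is returned together with a flag instead of an exception being taken.

```python
import math

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def ceil_l(value: float) -> tuple[int, bool]:
    """Return (result, invalid) for a ceiling conversion to 64-bit fixed-point."""
    if math.isnan(value) or math.isinf(value):
        return INT64_MAX, True                 # default result, Invalid Operation flagged
    result = math.ceil(value)                  # round toward plus infinity
    if not INT64_MIN <= result <= INT64_MAX:
        return INT64_MAX, True
    return result, False


print(ceil_l(1.2))
print(ceil_l(-1.2))
print(ceil_l(float("inf")))
```

The same structure applies to the 32-bit word conversions, with the bounds -2^31 and 2^31 - 1 in place of the 64-bit ones.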
CEIL.W.fmt      Floating-Point Ceiling Convert to Word Fixed-Point      CEIL.W.fmt


 31     26 25     21 20     16 15     11 10      6 5       0
    COP1       fmt        0         fs        fd      CEIL.W
   010001               00000                        001110
      6         5         5         5         5         6

MIPS II
Format:       CEIL.W.S fd, fs
              CEIL.W.D fd, fs
Purpose:      To convert an FP value to 32-bit fixed-point, rounding up.


Description:  fd ← convert_and_round (fs)


The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed-point
format, rounding toward +∞ (rounding mode 2). The result is placed in FPR fd.


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to
2^31 - 1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in the FCR31. If the Invalid Operation enable bit is set in
the FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^31 - 1, is written to fd.


Restrictions: 


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed-point; 
see Floating-Point Registers on page 10-2. If they are not valid, the result is undefined. 


Operation: 
StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W)) 
Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
CFC1          Move Control Word from Floating Point          CFC1


 31     26 25     21 20     16 15     11 10                    0
    COP1       CFC1       rt        fs              0
   010001     00010                            000 0000 0000
      6         5         5         5               11

MIPS I
Format:       CFC1 rt, fs
Purpose:      To copy a word from an FPU control register to a GPR.


Description:  rt ← FP_Control[fs]


Copy the 32-bit word from FP (coprocessor 1) control register fs into GPR rt,
sign-extending it if the GPR is 64 bits.


Restrictions: 


Only a couple of control registers are defined for the floating point unit. The result is
not defined if fs specifies a register that does not exist.


Operation:
GPR[rt] ← sign_extend (FCR[fs])
Exceptions: 


Coprocessor Unusable 
CTC1          Move Control Word to Floating Point          CTC1


 31     26 25     21 20     16 15     11 10                    0
    COP1       CTC1       rt        fs              0
   010001     00110                            000 0000 0000
      6         5         5         5               11

MIPS I
Format:       CTC1 rt, fs
Purpose:      To copy a word from a GPR to an FPU control register.


Description:  FP_Control[fs] ← rt

Copy the low word from GPR rt into FP (coprocessor 1) control register fs.


Writing to control register 31, the Floating-Point Control and Status Register or FCR31, 
causes the appropriate exception if any cause bit and its corresponding enable bit are both 
set. The register will be written before the exception occurs. 


Restrictions: 


Only a couple of control registers are defined for the floating point unit. The result is
not defined if fs specifies a register that does not exist.


Operation:


temp ← GPR[rt]_{31..0}
FCR[fs] ← temp


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 

Invalid Operation 

Unimplemented Operation 

Inexact 

Overflow 

Underflow 

Division by Zero 




CVT.D.fmt      Floating-Point Convert to Double Floating Point      CVT.D.fmt


 31     26 25     21 20     16 15     11 10      6 5       0
    COP1       fmt        0         fs        fd      CVT.D
   010001               00000                        100001
      6         5         5         5         5         6

MIPS I, III
Format:       CVT.D.S fd, fs
              CVT.D.W fd, fs
              CVT.D.L fd, fs
Purpose:      To convert an FP or fixed-point value to double FP.


Description:  fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in double floating-point format 
rounded according to the current rounding mode in FCR31. The result is placed in FPR fd. 


If fmt is S or W, then the operation is always exact. 


Restrictions: 


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for double floating point;
see Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, D, ConvertFmt (ValueFPR (fs, fmt), fmt, D)) 


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 


Note: 


Overflow and Underflow exceptions never occur because the range of the double-precision
data format covers every value of the other data types.
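

The exactness claims for this conversion can be checked numerically. The following is a minimal Python sketch with hypothetical helper names; it relies on Python floats being IEEE 754 doubles and on int-to-float conversion rounding to nearest, which is an assumption standing in for the rounding mode selected in FCR31.

```python
import struct


def single_to_double(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single and widen it to double."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def long_to_double(value: int) -> tuple[float, bool]:
    """Convert a 64-bit integer to double; the flag reports an inexact result."""
    result = float(value)                  # rounds when value needs more than 53 bits
    return result, int(result) != value


print(single_to_double(0x3F80_0000))       # a single widens without rounding
print(long_to_double(2**31 - 1))           # any 32-bit word is exact
print(long_to_double(2**53 + 1))           # a long value can be inexact
```

Only the L source format can raise Inexact here, which matches the statement that S and W sources always convert exactly.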
CVT.L.fmt      Floating-Point Convert to Long Fixed-Point      CVT.L.fmt


 31     26 25     21 20     16 15     11 10      6 5       0
    COP1       fmt        0         fs        fd      CVT.L
   010001               00000                        100101
      6         5         5         5         5         6

MIPS III
Format:       CVT.L.S fd, fs
              CVT.L.D fd, fs

Purpose:      To convert an FP value to a 64-bit fixed-point.


Description:  fd ← convert_and_round (fs)


Convert the value in format fmt in FPR fs to long fixed-point format, round according to 
the current rounding mode in FCR31, and place the result in FPR fd. 


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to
2^63 - 1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in the FCR31. If the Invalid Operation enable bit is set in
the FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^63 - 1, is written to fd.


Restrictions: 


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed-point;
see Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L)) 


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
CVT.S.fmt      Floating-Point Convert to Single Floating-Point      CVT.S.fmt


 31     26 25     21 20     16 15     11 10      6 5       0
    COP1       fmt        0         fs        fd      CVT.S
   010001               00000                        100000
      6         5         5         5         5         6

MIPS I, III
Format:       CVT.S.D fd, fs
              CVT.S.W fd, fs
              CVT.S.L fd, fs
Purpose:      To convert an FP or fixed-point value to single FP.


Description:  fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in single floating-point format 
rounded according to the current rounding mode in FCR31. The result is placed in FPR fd. 


Restrictions: 


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for single floating point;
see Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, S, ConvertFmt (ValueFPR (fs, fmt), fmt, S)) 


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
Underflow 




CVT.W.fmt      Floating-Point Convert to Word Fixed-Point      CVT.W.fmt


 31     26 25     21 20     16 15     11 10      6 5       0
    COP1       fmt        0         fs        fd      CVT.W
   010001               00000                        100100
      6         5         5         5         5         6

MIPS I
Format:       CVT.W.S fd, fs
              CVT.W.D fd, fs
Purpose:      To convert an FP value to a 32-bit fixed-point.


Description:  fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in 32-bit word fixed-point format 
rounded according to the current rounding mode in FCR31. The result is placed in FPR fd. 


When the source value is Infinity, NaN, or rounds to an integer outside the range -2?! to 
231-1, the result cannot be represented correctly and an IEEE Invalid Operation condition 
exists. 


The Invalid Operation flag is set in the FCR31. If the Invalid Operation enable bit is set in 
the FCR31, no result is written to fd and an Invalid Operation exception is taken 
immediately. Otherwise, the default result, 2°! —1, is written to fd. 


Restrictions: 


The field fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed point; see 
Floating-Point Resisters on page 10-2. If they are not valid, the result is undefined. 


Operation: 
StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W)) 
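
The range check and default result described above can be sketched as follows. This is a minimal model under stated assumptions: the function name `cvt_w`, its return tuple, and the use of rounding-mode code 2 for "toward plus infinity" are choices made for this example (codes 0, 1 and 3 follow the numbering used elsewhere in this appendix).

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

# 0 = nearest/even, 1 = toward zero, 3 = toward minus infinity (as in this
# appendix); 2 = toward plus infinity is an assumption of this sketch.
ROUNDERS = {0: round, 1: math.trunc, 2: math.ceil, 3: math.floor}

def cvt_w(value: float, rm: int, invalid_enable: bool = False):
    """Model CVT.W.fmt. Returns (result, invalid_flag, exception_taken)."""
    rounded = None
    if math.isfinite(value):
        rounded = ROUNDERS[rm](value)
    if rounded is None or not INT32_MIN <= rounded <= INT32_MAX:
        if invalid_enable:
            return None, True, True       # fd is not written; the trap is taken
        return INT32_MAX, True, False     # default result 2^31-1
    return rounded, False, False
```

For example, `cvt_w(float("nan"), 0)` yields the default result with the invalid flag set, while the same call with `invalid_enable=True` reports that no result is written.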
Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
DIV.fmt Floating Point Divide DIV.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1                                 DIV
010001   fmt    ft     fs     fd     000011
6        5      5      5      5      6
MIPS I
Format: DIV.S fd, fs, ft 
DIV.D fd, fs, ft 
Purpose: To divide FP values. 


Description: fd ← fs / ft


The value in FPR fs is divided by the value in FPR ft. The result is calculated to infinite 
precision, rounded according to the current rounding mode in FCR31, and placed into FPR 
fd. The operands and result are values in format fmt. 


Restrictions: 


The fields fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, fmt, ValueFPR (fs, fmt) / ValueFPR (ft, fmt)) 


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 

Inexact 

Unimplemented Operation 

Division-by-zero 

Invalid Operation 

Overflow 

Underflow 
DMFC1 Doubleword Move From Floating-Point DMFC1


31    26 25  21 20  16 15  11 10            0
COP1     DMFC1                0
010001   00001  rt     fs     000 0000 0000
6        5      5      5      11
MIPS III
Format: DMFC1 rt, fs


Purpose: To copy a doubleword from an FPR to a GPR. 
Description: rt ← fs
The doubleword contents of FPR fs are placed into GPR rt. 


If the coprocessor 1 general registers are 32-bits wide (a native 32-bit processor or 32-bit 
register emulation mode in a 64-bit processor), FPR fs is held in an even/odd register pair. 
The low word is taken from the even register fs and the high word is from fs+1. 


Restrictions: 


If fs does not specify an FPR that can contain a doubleword, the result is undefined; see 
Floating Point Registers on page 10-2. 


Operation: 


if SizeFGR() = 64 then /* 64-bit wide FGRs */
    data ← FGR[fs]
elseif fs_0 = 0 then /* valid specifier, 32-bit wide FGRs */
    data ← FGR[fs+1] || FGR[fs]
else /* undefined for odd 32-bit FGRs */
    UndefinedResult()
endif
GPR[rt] ← data
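
The even/odd register pairing in the operation above can be illustrated with a small Python model. The list `fgr` standing in for the FGR file, the `fgr_size` parameter, and the use of an exception for the undefined case are all assumptions of this sketch.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def dmfc1(fgr: list[int], fs: int, fgr_size: int = 32) -> int:
    """Return the 64-bit value DMFC1 would copy to GPR rt.

    fgr is a hypothetical list modeling the FGR file; fgr_size is 32 or 64.
    """
    if fgr_size == 64:
        return fgr[fs] & MASK64
    if fs & 1 == 0:
        # Even/odd pair: high word from fs+1, low word from fs.
        return ((fgr[fs + 1] & MASK32) << 32) | (fgr[fs] & MASK32)
    # The manual leaves this case undefined; the sketch simply rejects it.
    raise ValueError("odd fs with 32-bit FGRs: result is undefined")
```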


Exceptions: 


Reserved Instruction 
Coprocessor Unusable 
DMTC1 Doubleword Move To Floating-Point DMTC1


31    26 25  21 20  16 15  11 10            0
COP1     DMTC1                0
010001   00101  rt     fs     000 0000 0000
6        5      5      5      11
MIPS III
Format: DMTC1 rt, fs


Purpose: To copy a doubleword from a GPR to an FPR. 
Description: fs ← rt
The doubleword contents of GPR rt are placed into FPR fs.


If the coprocessor 1 general registers are 32 bits wide (a native 32-bit processor or 32-bit
register emulation mode in a 64-bit processor), FPR fs is held in an even/odd register pair.
The low word is placed in the even register fs and the high word is placed in fs+1.


Restrictions: 


If fs does not specify an FPR that can contain a doubleword, the result is undefined; see 
Floating Point Registers on page 10-2. 


Operation: 


data ← GPR[rt]
if SizeFGR() = 64 then /* 64-bit wide FGRs */
    FGR[fs] ← data
elseif fs_0 = 0 then /* valid specifier, 32-bit wide FGRs */
    FGR[fs+1] ← data_63..32
    FGR[fs] ← data_31..0
else /* undefined result for odd 32-bit FGRs */
    UndefinedResult()
endif
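
A corresponding sketch for the write direction splits the doubleword across the register pair. As before, the `fgr` list, the `fgr_size` parameter, and raising an exception for the undefined case are assumptions made for illustration.

```python
def dmtc1(fgr: list[int], fs: int, data: int, fgr_size: int = 32) -> None:
    """Write a 64-bit GPR value into a modeled FGR file (hypothetical list)."""
    data &= 0xFFFFFFFFFFFFFFFF
    if fgr_size == 64:
        fgr[fs] = data
    elif fs & 1 == 0:
        fgr[fs + 1] = data >> 32          # data bits 63..32
        fgr[fs] = data & 0xFFFFFFFF       # data bits 31..0
    else:
        # Undefined in the manual; rejected here so misuse is visible.
        raise ValueError("odd fs with 32-bit FGRs: result is undefined")
```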


Exceptions: 


Reserved Instruction 
Coprocessor Unusable 
FLOOR.L.fmt Floating-Point Floor Convert to Long Fixed-Point FLOOR.L.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    FLOOR.L
010001   fmt    00000  fs     fd     001011
6        5      5      5      5      6
MIPS III
Format: FLOOR.L.S fd, fs 
FLOOR.L.D fd, fs 

Purpose: To convert an FP value to a 64-bit fixed-point, rounding down. 


Description: fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in 64-bit long fixed-point format
rounding toward -∞ (rounding mode 3). The result is placed in FPR fd.


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to
2^63-1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in FCR31. If the Invalid Operation enable bit is set in
FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^63-1, is written to fd.


Restrictions:


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed point; see
Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L)) 
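
The 64-bit boundary behavior can be checked with a short model. This sketch assumes the Invalid Operation trap is disabled and uses a hypothetical helper name, `floor_l`; it operates on Python floats (IEEE doubles).

```python
import math

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def floor_l(value: float):
    """Model FLOOR.L.fmt with the Invalid Operation trap disabled.

    Returns (result, invalid_flag).
    """
    if not math.isfinite(value):
        return INT64_MAX, True
    result = math.floor(value)            # round toward minus infinity
    if not INT64_MIN <= result <= INT64_MAX:
        return INT64_MAX, True
    return result, False
```

Note the asymmetry at the limits: -2^63 is representable and converts normally, whereas +2^63 is one past the largest long and produces the default result.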
Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
FLOOR.W.fmt Floating-Point Floor Convert to Word Fixed-Point FLOOR.W.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    FLOOR.W
010001   fmt    00000  fs     fd     001111
6        5      5      5      5      6
MIPS II
Format: FLOOR.W.S fd, fs
        FLOOR.W.D fd, fs
Purpose: To convert an FP value to a 32-bit fixed-point, rounding down. 


Description: fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in 32-bit word fixed-point
format rounding toward -∞ (rounding mode 3). The result is placed in FPR fd.


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to
2^31-1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in FCR31. If the Invalid Operation enable bit is set in
FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^31-1, is written to fd.


Restrictions:


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed point; see
Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W)) 
Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
LDC1 Load Doubleword to Floating-Point LDC1


31    26 25  21 20  16 15                  0
LDC1     base   ft     offset
6        5      5      16
MIPS II
Format: LDC1 ft, offset (base)
Purpose: To load a doubleword from memory to an FPR.


Description: ft ← memory[base+offset]


The contents of the 64-bit doubleword at the memory location specified by the aligned 
effective address are fetched and placed in FPR ft. The 16-bit signed offset is added to the 
contents of GPR base to form the effective address. 


If coprocessor 1 general registers are 32-bits wide (a native 32-bit processor or 32-bit 
register emulation mode in a 64-bit processor), FPR ft is held in an even/odd register pair. 
The low word is placed in the even register ft and the high word is placed in ft+1. 


Restrictions: 


If ft does not specify an FPR that can contain a doubleword, the result is undefined; see
Floating-Point Registers on page 10-2.


An Address Error exception occurs if EffectiveAddress_2..0 ≠ 0 (not doubleword-aligned).


Operation: 


vAddr ← sign_extend (offset) + GPR[base]
if vAddr_2..0 ≠ 0^3 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
data ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
if SizeFGR() = 64 then /* 64-bit wide FGRs */
    FGR[ft] ← data
elseif ft_0 = 0 then /* valid specifier, 32-bit wide FGRs */
    FGR[ft+1] ← data_63..32
    FGR[ft] ← data_31..0
else /* undefined result for odd 32-bit FGRs */
    UndefinedResult()
endif
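
The first two steps of the operation above (effective address formation and the alignment check) can be sketched in Python. The helper names and the `addr_bits` parameter are assumptions of this example; address translation and the memory access itself are not modeled.

```python
def sign_extend16(offset: int) -> int:
    """Interpret the low 16 bits of offset as a signed value."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset

def ldc1_effective_address(base: int, offset: int, addr_bits: int = 32) -> int:
    """Return vAddr for LDC1, raising on a misaligned doubleword address."""
    vaddr = (base + sign_extend16(offset)) & ((1 << addr_bits) - 1)
    if vaddr & 0x7:                       # vAddr bits 2..0 must all be zero
        raise ValueError(f"AddressError: {vaddr:#x} is not doubleword-aligned")
    return vaddr
```

An offset field of 0xFFF8 is therefore a displacement of -8, not +65528.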


Exceptions: 


Coprocessor Unusable 
TLB Refill 

TLB Invalid 

Address Error 
LWC1 Load Word to Floating Point LWC1


31    26 25  21 20  16 15                  0
LWC1     base   ft     offset
6        5      5      16
MIPS I
Format: LWC1 ft, offset (base) 
Purpose: To load a word from memory to an FPR. 


Description: ft ← memory[base+offset]


The contents of the 32-bit word at the memory location specified by the aligned effective
address are fetched and placed into the low word of coprocessor 1 general register ft. The
16-bit signed offset is added to the contents of GPR base to form the effective address.


If coprocessor 1 general registers are 64 bits wide, bits 63..32 of register ft become
undefined. See Floating-Point Registers on page 10-2.


Restrictions:
An Address Error exception occurs if EffectiveAddress_1..0 ≠ 0 (not word-aligned).


Operation: 32-bit Processors 


I:   /* "mem" is aligned 64-bits from memory. Pick out correct bytes. */
     vAddr ← sign_extend (offset) + GPR[base]
     if vAddr_1..0 ≠ 0^2 then SignalException (AddressError) endif
     (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
     mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
I+1: FGR[ft] ← mem


Operation: 64-bit Processors 


/* "mem" is aligned 64-bits from memory. Pick out correct bytes. */
vAddr ← sign_extend (offset) + GPR[base]
if vAddr_1..0 ≠ 0^2 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
bytesel ← vAddr_2..0 xor (BigEndianCPU || 0^2)
if SizeFGR() = 64 then /* 64-bit wide FGRs */
    FGR[ft] ← undefined^32 || mem_31+8*bytesel..8*bytesel
else /* 32-bit wide FGRs */
    FGR[ft] ← mem_31+8*bytesel..8*bytesel
endif
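
The byte-selection step of the 64-bit operation can be illustrated as follows. This is a sketch that assumes `mem` is the aligned 64-bit doubleword returned by the memory system and that `vaddr` is already word-aligned; the function name is hypothetical.

```python
def lwc1_select_word(mem: int, vaddr: int, big_endian_cpu: bool) -> int:
    """Pick the addressed 32-bit word out of an aligned 64-bit doubleword."""
    # bytesel = vAddr bits 2..0 xor (BigEndianCPU || 00)
    bytesel = (vaddr & 0x7) ^ (int(big_endian_cpu) << 2)
    return (mem >> (8 * bytesel)) & 0xFFFFFFFF
```

With `mem = 0x1122334455667788`, a little-endian CPU reads 0x55667788 at offset 0 and 0x11223344 at offset 4; a big-endian CPU reads them the other way round.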


Exceptions: 
Coprocessor Unusable
TLB Refill 


TLB Invalid 
Address Error 
MFC1 Move Word from Floating Point MFC1


31    26 25  21 20  16 15  11 10            0
010001   00000  rt     fs     000 0000 0000
6        5      5      5      11
MIPS I
Format: MFC1 rt, fs 


Purpose: To copy a word from an FPU (COP1) general register to a GPR. 
Description: rt ← fs


The low word from FPR fs is placed into the low word of GPR rt. If GPR rt is 64 bits wide,
then the value is sign extended. See Floating-Point Registers on page 10-2.


Restrictions: 
None 


Operation: 
GPR[rt] ← sign_extend (FPR[fs]_31..0)
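
The sign extension in this operation can be sketched on raw bit patterns. The helper name `mfc1` and the `gpr_bits` parameter are assumptions of this example; the result is returned as an unsigned Python integer holding the register's bit pattern.

```python
def mfc1(fpr_value: int, gpr_bits: int = 64) -> int:
    """Return the GPR contents after MFC1, as an unsigned bit pattern."""
    low = fpr_value & 0xFFFFFFFF
    if low & 0x80000000:                  # replicate bit 31 into the upper bits
        low |= ((1 << gpr_bits) - 1) & ~0xFFFFFFFF
    return low
```

Because the copy is sign extended, moving the single-precision pattern for a negative number (sign bit set) fills the upper half of a 64-bit GPR with ones.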
Exceptions: 


Coprocessor Unusable 
MOV.fmt Floating Point Move MOV.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    MOV
010001   fmt    00000  fs     fd     000110
6        5      5      5      5      6
MIPS I
Format: MOV.S fd, fs
        MOV.D fd, fs


Purpose: To move an FP value between FPRs. 
Description: fd ← fs


The value in FPR fs is placed into FPR fd. The source and destination are values in
format fmt.


The move is non-arithmetic; it causes no IEEE 754 exceptions.


Restrictions:


The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, fmt, ValueFPR (fs, fmt)) 


Exceptions: 
Coprocessor Unusable 
Reserved Instruction 


Floating-Point 
Unimplemented Operation 
MTC1 Move Word to Floating Point MTC1


31    26 25  21 20  16 15  11 10            0
010001   00100  rt     fs     000 0000 0000
6        5      5      5      11
MIPS I
Format: MTC1 rt, fs 


Purpose: To copy a word from a GPR to an FPU (COP1) general register. 
Description: fs ← rt


The low word in GPR rt is placed into the low word of floating-point (coprocessor 1) 
general register fs. If coprocessor 1 general registers are 64-bits wide, bits 63..32 of 
register fs become undefined. See Floating-Point Registers on page 10-2. 


Operation: 
data ← GPR[rt]_31..0
if SizeFGR() = 64 then /* 64-bit wide FGRs */
    FGR[fs] ← undefined^32 || data
else /* 32-bit wide FGRs */
    FGR[fs] ← data
endif


Exceptions: 


Coprocessor Unusable 




MUL.fmt Floating Point Multiply MUL.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1                                 MUL
010001   fmt    ft     fs     fd     000010
6        5      5      5      5      6
MIPS I
Format: MUL.S fd, fs, ft
        MUL.D fd, fs, ft


Purpose: To multiply FP values. 
Description: fd ← fs × ft


The value in FPR fs is multiplied by the value in FPR ft. The result is calculated to
infinite precision, rounded according to the current rounding mode in FCR31, and placed
into FPR fd. The operands and result are values in format fmt.


Restrictions:


The fields fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation: 
StoreFPR (fd, fmt, ValueFPR (fs, fmt) * ValueFPR (ft, fmt)) 


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Inexact 
Unimplemented Operation 
Invalid Operation 
Overflow 
Underflow 
NEG.fmt Floating Point Negate NEG.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    NEG
010001   fmt    00000  fs     fd     000111
6        5      5      5      5      6
MIPS I
Format: NEG.S fd, fs
        NEG.D fd, fs


Purpose: To negate a floating-point value. 
Description: fd ← -(fs)


The value in FPR fs is negated and placed into FPR fd. The value is negated by changing
the sign bit value. The operand and result are values in format fmt.


This operation is arithmetic; a NaN operand signals invalid operation.


Restrictions:


The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation:
StoreFPR (fd, fmt, Negate (ValueFPR (fs, fmt)))
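
Negation by flipping the sign bit can be shown on a single-precision bit pattern. The helper names below are assumptions of this sketch, and the NaN case (which the text above says signals invalid operation) is deliberately not modeled.

```python
import struct

def float_to_bits(value: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def neg_s(bits: int) -> int:
    """Negate a single-precision value by inverting bit 31 (the sign bit)."""
    return (bits ^ 0x80000000) & 0xFFFFFFFF
```

Only the sign bit changes, so the exponent and fraction fields of the result are identical to those of the operand; negating +0 gives -0.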


Exceptions: 


Coprocessor Unusable 

Reserved Instruction 

Floating-Point 
Unimplemented Operation 
Invalid Operation 
ROUND.L.fmt Floating Point Round to Long Fixed-Point ROUND.L.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    ROUND.L
010001   fmt    00000  fs     fd     001000
6        5      5      5      5      6
MIPS III
Format: ROUND.L.S fd, fs 
ROUND.L.D fd, fs 

Purpose: To convert an FP value to 64-bit fixed-point, round to nearest. 


Description: fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in 64-bit long fixed-point format
rounding to nearest/even (rounding mode 0). The result is placed in FPR fd.


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to
2^63-1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in FCR31. If the Invalid Operation enable bit is set in
FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^63-1, is written to fd.


Restrictions:


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed point; see
Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation:
StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))
Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 

Inexact 

Unimplemented Operation 

Overflow 

Invalid Operation 
ROUND.W.fmt Floating Point Round to Word Fixed-Point ROUND.W.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    ROUND.W
010001   fmt    00000  fs     fd     001100
6        5      5      5      5      6
MIPS II
Format: ROUND.W.S fd, fs
        ROUND.W.D fd, fs
Purpose: To convert an FP value to 32-bit fixed-point, round to nearest. 
Description: fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in 32-bit word fixed-point
format rounding to nearest/even (rounding mode 0). The result is placed in FPR fd.


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to
2^31-1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in FCR31. If the Invalid Operation enable bit is set in
FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^31-1, is written to fd.


Restrictions:


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed point; see
Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation:
StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 

Inexact 

Unimplemented Operation 

Overflow 

Invalid Operation 
SDC1 Store Doubleword from Floating-Point SDC1


31    26 25  21 20  16 15                  0
SDC1     base   ft     offset
6        5      5      16
MIPS II
Format: SDC1 ft, offset (base)
Purpose: To store a doubleword from an FPR to memory.


Description: memory[base+offset] ← ft


The 64-bit doubleword in FPR ft is stored in memory at the location specified by the 
aligned effective address. The 16-bit signed offset is added to the contents of GPR base to 
form the effective address. 


If coprocessor 1 general registers are 32-bits wide (a native 32-bit processor or 32-bit 
register emulation mode in a 64-bit processor), FPR ft is held in an even/odd register pair. 
The low word is taken from the even register ft and the high word is from ft+1. 


Restrictions: 


If ft does not specify an FPR that can contain a doubleword, the result is undefined; see
Floating-Point Registers on page 10-2.


An Address Error exception occurs if EffectiveAddress_2..0 ≠ 0 (not doubleword-aligned).


Operation: 


vAddr ← sign_extend (offset) + GPR[base]
if vAddr_2..0 ≠ 0^3 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
if SizeFGR() = 64 then /* 64-bit wide FGRs */
    data ← FGR[ft]
elseif ft_0 = 0 then /* valid specifier, 32-bit wide FGRs */
    data ← FGR[ft+1] || FGR[ft]
else /* undefined for odd 32-bit FGRs */
    UndefinedResult()
endif
StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)


Exceptions: 


Coprocessor Unusable 
TLB Refill 

TLB Invalid 

TLB Modified 
Address Error 
SQRT.fmt Floating Point Square Root SQRT.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    SQRT
010001   fmt    00000  fs     fd     000100
6        5      5      5      5      6
MIPS II
Format: SQRT.S fd, fs 
SQRT.D fd, fs 
Purpose: To compute the square root of an FP value. 


Description: fd ← SQRT (fs)


The square root of the value in FPR fs is calculated to infinite precision, rounded 
according to the current rounding mode in FCR31, and placed into FPR fd. The operand 
and result are values in format fmt. 


If the value in FPR fs corresponds to -0, the result will be -0.


Restrictions:
If the value in FPR fs is less than 0, an Invalid Operation condition is raised.


The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation:
StoreFPR (fd, fmt, SquareRoot (ValueFPR (fs, fmt)))
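
The two special cases called out above (-0 and negative operands) can be sketched in Python. The helper name `sqrt_fp`, the returned flag, and the choice of NaN as the result for the invalid case are assumptions of this example; the manual text here does not state the default result.

```python
import math

def sqrt_fp(value: float):
    """Model SQRT.fmt on a double. Returns (result, invalid_flag)."""
    if value < 0.0:
        # Invalid Operation condition; a NaN result is assumed for this sketch.
        return math.nan, True
    if value == 0.0:
        return value, False               # sign is kept, so -0 yields -0
    # A NaN operand falls through here and math.sqrt returns NaN unchanged.
    return math.sqrt(value), False
```

Note that -0 compares equal to 0 and is not "less than 0", so it takes the zero path and is returned with its sign intact rather than raising the invalid condition.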
Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 

Inexact 

Unimplemented Operation 

Invalid Operation 
SUB.fmt Floating Point Subtract SUB.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1                                 SUB
010001   fmt    ft     fs     fd     000001
6        5      5      5      5      6
MIPS I
Format: SUB.S fd, fs, ft
        SUB.D fd, fs, ft


Purpose: To subtract FP values. 
Description: fd ← fs - ft


The value in FPR ft is subtracted from the value in FPR fs. The result is calculated to
infinite precision, rounded according to the current rounding mode in FCR31, and placed
into FPR fd. The operands and result are values in format fmt.


Restrictions:


The fields fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point
Registers on page 10-2. If they are not valid, the result is undefined.


Operation:
StoreFPR (fd, fmt, ValueFPR (fs, fmt) - ValueFPR (ft, fmt))


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Inexact 
Unimplemented Operation 
Invalid Operation 
Overflow 
Underflow 
SWC1 Store Word from Floating Point SWC1


31    26 25  21 20  16 15                  0
SWC1     base   ft     offset
6        5      5      16
MIPS I
Format: SWC1 ft, offset (base)
Purpose: To store a word from an FPR to memory.


Description: memory[base+offset] ← ft


The low 32-bit word from FPR ft is stored in memory at the location specified by the 
aligned effective address. The 16-bit signed offset is added to the contents of GPR base to 
form the effective address. 


Restrictions: 
An Address Error exception occurs if EffectiveAddress_1..0 ≠ 0 (not word-aligned).


Operation: 32-bit Processors 


vAddr ← sign_extend (offset) + GPR[base]
if vAddr_1..0 ≠ 0^2 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
data ← FGR[ft]
StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)


Operation: 64-bit Processors 


vAddr ← sign_extend (offset) + GPR[base]
if vAddr_1..0 ≠ 0^2 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
bytesel ← vAddr_2..0 xor (BigEndianCPU || 0^2)
/* the bytes of the word are moved into the correct byte lanes */
if SizeFGR() = 64 then /* 64-bit wide FGRs */
    data ← 0^(32-8*bytesel) || FGR[ft]_31..0 || 0^(8*bytesel) /* top or bottom wd of 64-bit data */
else /* 32-bit wide FGRs */
    data ← 0^(32-8*bytesel) || FGR[ft] || 0^(8*bytesel) /* top or bottom wd of 64-bit data */
endif
StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
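
The byte-lane placement in the 64-bit operation amounts to shifting the low word of the FGR left by 8*bytesel bits within a 64-bit datum. A minimal sketch, assuming a word-aligned `vaddr` and using a hypothetical helper name:

```python
def swc1_byte_lanes(fgr_value: int, vaddr: int, big_endian_cpu: bool) -> int:
    """Place the low word of an FGR into its lane of a 64-bit store datum."""
    # bytesel = vAddr bits 2..0 xor (BigEndianCPU || 00)
    bytesel = (vaddr & 0x7) ^ (int(big_endian_cpu) << 2)
    # Equivalent to 0^(32-8*bytesel) || FGR[ft] bits 31..0 || 0^(8*bytesel)
    return (fgr_value & 0xFFFFFFFF) << (8 * bytesel)
```

Any upper 32 bits held in a 64-bit FGR are discarded before the shift, matching the use of bits 31..0 in the pseudocode.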
Exceptions: 
Coprocessor Unusable 
TLB Refill 
TLB Invalid 
TLB Modified 


Address Error 
TRUNC.L.fmt Floating Point Truncate to Long Fixed-Point TRUNC.L.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    TRUNC.L
010001   fmt    00000  fs     fd     001001
6        5      5      5      5      6
MIPS III
Format: TRUNC.L.S fd, fs
        TRUNC.L.D fd, fs


Purpose: To convert an FP value to 64-bit fixed-point, rounding toward zero. 
Description: fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in 64-bit long fixed-point format
rounding toward zero (rounding mode 1). The result is placed in FPR fd.


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to
2^63-1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in FCR31. If the Invalid Operation enable bit is set in
FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^63-1, is written to fd.


Restrictions:


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed-point; see
Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation:
StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))
Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
TRUNC.W.fmt Floating Point Truncate to Word Fixed-Point TRUNC.W.fmt


31    26 25  21 20  16 15  11 10   6 5       0
COP1            0                    TRUNC.W
010001   fmt    00000  fs     fd     001101
6        5      5      5      5      6
MIPS II
Format: TRUNC.W.S fd, fs
        TRUNC.W.D fd, fs


Purpose: To convert an FP value to 32-bit fixed-point, rounding toward zero. 
Description: fd ← convert_and_round (fs)


The value in FPR fs in format fmt is converted to a value in 32-bit word fixed-point
format rounding toward zero (rounding mode 1). The result is placed in FPR fd.


When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to
2^31-1, the result cannot be represented correctly and an IEEE Invalid Operation condition
exists.


The Invalid Operation flag is set in FCR31. If the Invalid Operation enable bit is set in
FCR31, no result is written to fd and an Invalid Operation exception is taken
immediately. Otherwise, the default result, 2^31-1, is written to fd.


Restrictions:


The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed-point;
see Floating-Point Registers on page 10-2. If they are not valid, the result is undefined.


Operation:
StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))


Exceptions: 


Coprocessor Unusable 
Reserved Instruction 
Floating-Point 
Invalid Operation 
Unimplemented Operation 
Inexact 
Overflow 
D.4 COP1 Instruction Encoding


Instructions encoded by OpCode field (COP1, LWC1, SWC1, LDC1, SDC1)
[Table: OpCode occupies bits 31..26. Rows are selected by bits 31..29 and columns by bits 28..26. Cell contents are not legible in the source.]


Instructions encoded by rs field when OpCode field = COP1
[Table: rs occupies bits 25..21. Rows are selected by bits 25..24 and columns by bits 23..21. Only the BC1 entry is legible in the source.]


Instructions encoded by rt field when OpCode field = COP1 & rs field = BC1
[Table: rt occupies bits 20..16. Rows are selected by bits 20..19 and columns by bits 18..16. Cell contents are not legible in the source.]


Instructions encoded by function field when OpCode field = COP1 & rs field = S, D
[Table: columns are selected by function bits 2..0. The first row reads ADD, SUB, MUL, DIV, SQRT, ABS, MOV, NEG; the remaining rows are not legible in the source.]


Instructions encoded by function field when OpCode field = COP1 & rs field = W, L
[Table: columns are selected by function bits 2..0. Cell contents are not legible in the source.]



This OpCode is reserved for future use. An attempt to execute it causes a 
Reserved Instruction exception but this is not guaranteed. 

This OpCode is reserved for future use. An attempt to execute it produces 
an undefined result. The result may be an Unimplemented Operation 
exception. 

This OpCode indicates an instruction class. The instruction word must be 
further decoded by examining additional tables that show the values for 
another instruction field. 

This OpCode is reserved for one of the following instructions which are 
currently not supported: DMULT, DMULTU, DDIV, DDIVU, LL, LLD, SC, 
SCD, LWC2, SWC2. An attempt to execute it causes a Reserved Instruction 
exception. 
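
As a hedged illustration of how the encoding tables are indexed, the following Python sketch splits a 32-bit instruction word into the fields named in the instruction formats of this appendix. The function name `decode_cop1` and the dictionary keys are assumptions of this example; it only extracts fields and does not look anything up in the tables.

```python
def decode_cop1(word: int) -> dict:
    """Split a 32-bit instruction word into the fields used by the COP1 tables."""
    return {
        "opcode":   (word >> 26) & 0x3F,   # bits 31..26
        "rs":       (word >> 21) & 0x1F,   # bits 25..21 (fmt for arithmetic ops)
        "rt":       (word >> 16) & 0x1F,   # bits 20..16 (ft)
        "fs":       (word >> 11) & 0x1F,   # bits 15..11
        "fd":       (word >> 6) & 0x1F,    # bits 10..6
        "function": word & 0x3F,           # bits 5..0
    }
```

For instance, the word 0x44251800 decodes to opcode 010001 (COP1) and rs 00001, which is the DMFC1 pattern shown earlier, with rt = 5 and fs = 3.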
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